Networking
Configuring and managing cluster networking
Abstract
Chapter 1. About networking
Red Hat OpenShift Networking is an ecosystem of features, plugins, and advanced networking capabilities that extends Kubernetes networking with the features your cluster needs to manage network traffic for one or multiple hybrid clusters. This ecosystem of networking capabilities integrates ingress, egress, load balancing, high-performance throughput, security, and inter- and intra-cluster traffic management, and it provides role-based observability tooling to reduce its natural complexities.
The following list highlights some of the most commonly used Red Hat OpenShift Networking features available on your cluster:
Primary cluster network provided by one of the following Container Network Interface (CNI) plugins:
- OVN-Kubernetes network plugin, the default plugin
- OpenShift SDN network plugin
- Certified 3rd-party alternative primary network plugins
- Cluster Network Operator for network plugin management
- Ingress Operator for TLS encrypted web traffic
- DNS Operator for name assignment
- MetalLB Operator for traffic load balancing on bare metal clusters
- IP failover support for high-availability
- Additional hardware network support through multiple CNI plugins, including for macvlan, ipvlan, and SR-IOV hardware networks
- IPv4, IPv6, and dual stack addressing
- Hybrid Linux-Windows host clusters for Windows-based workloads
- Red Hat OpenShift Service Mesh for discovery, load balancing, service-to-service authentication, failure recovery, metrics, and monitoring of services
- Single-node OpenShift
- Network Observability Operator for network debugging and insights
- Submariner for inter-cluster networking
- Red Hat Service Interconnect for layer 7 inter-cluster networking
Chapter 2. Understanding networking
Cluster administrators have several options for exposing applications that run inside a cluster to external traffic and securing network connections:
- Service types, such as node ports or load balancers
- API resources, such as Ingress and Route
By default, Kubernetes allocates each pod an internal IP address for applications running within the pod. Pods and their containers can network, but clients outside the cluster do not have networking access. When you expose your application to external traffic, giving each pod its own IP address means that pods can be treated like physical hosts or virtual machines in terms of port allocation, networking, naming, service discovery, load balancing, application configuration, and migration.
Some cloud platforms offer metadata APIs that listen on the 169.254.169.254 IP address, a link-local IP address in the IPv4 169.254.0.0/16 CIDR block.
This CIDR block is not reachable from the pod network. Pods that need access to these IP addresses must be given host network access by setting the spec.hostNetwork field in the pod spec to true.
If you allow a pod host network access, you grant the pod privileged access to the underlying network infrastructure.
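For example, a pod that must reach the metadata API can request host networking. The following is a minimal sketch; the pod name and image are illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: metadata-reader          # illustrative name
spec:
  hostNetwork: true              # grants the pod access to the host network, including link-local addresses
  containers:
  - name: reader
    image: registry.example.com/tools:latest   # illustrative image
    command: ["sleep", "infinity"]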
2.1. OpenShift Container Platform DNS
If you are running multiple services, such as front-end and back-end services for use with multiple pods, environment variables are created for user names, service IPs, and more so the front-end pods can communicate with the back-end services. If the service is deleted and recreated, a new IP address can be assigned to the service, which requires the front-end pods to be recreated to pick up the updated value of the service IP environment variable. Additionally, the back-end service must be created before any of the front-end pods to ensure that the service IP is generated properly, and that it can be provided to the front-end pods as an environment variable.
For this reason, OpenShift Container Platform has a built-in DNS so that the services can be reached by the service DNS as well as the service IP/port.
2.2. OpenShift Container Platform Ingress Operator
When you create your OpenShift Container Platform cluster, pods and services running on the cluster are each allocated their own IP addresses. The IP addresses are accessible to other pods and services running nearby but are not accessible to outside clients. The Ingress Operator implements the IngressController API and is the component responsible for enabling external access to OpenShift Container Platform cluster services.
The Ingress Operator makes it possible for external clients to access your service by deploying and managing one or more HAProxy-based Ingress Controllers to handle routing. You can use the Ingress Operator to route traffic by specifying OpenShift Container Platform Route and Kubernetes Ingress resources. Configurations within the Ingress Controller, such as the ability to define endpointPublishingStrategy type and internal load balancing, provide ways to publish Ingress Controller endpoints.
2.2.1. Comparing routes and Ingress
The Kubernetes Ingress resource in OpenShift Container Platform implements the Ingress Controller with a shared router service that runs as a pod inside the cluster. The most common way to manage Ingress traffic is with the Ingress Controller. You can scale and replicate this pod like any other regular pod. This router service is based on HAProxy, which is an open source load balancer solution.
The OpenShift Container Platform route provides Ingress traffic to services in the cluster. Routes provide advanced features that might not be supported by standard Kubernetes Ingress Controllers, such as TLS re-encryption, TLS passthrough, and split traffic for blue-green deployments.
Ingress traffic accesses services in the cluster through a route. Routes and Ingress are the main resources for handling Ingress traffic. Ingress provides features similar to a route, such as accepting external requests and delegating them based on the route. However, with Ingress you can only allow certain types of connections: HTTP/2, HTTPS and server name indication (SNI), and TLS with certificate. In OpenShift Container Platform, routes are generated to meet the conditions specified by the Ingress resource.
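For example, a minimal edge-terminated route for a hypothetical frontend service might look like the following sketch:

apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: frontend                 # illustrative name
spec:
  host: frontend.apps.example.com   # illustrative host name
  to:
    kind: Service
    name: frontend
  tls:
    termination: edge            # TLS is terminated at the router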
2.3. Glossary of common terms for OpenShift Container Platform networking
This glossary defines common terms that are used in the networking content.
- authentication
- To control access to an OpenShift Container Platform cluster, a cluster administrator can configure user authentication and ensure only approved users access the cluster. To interact with an OpenShift Container Platform cluster, you must authenticate to the OpenShift Container Platform API. You can authenticate by providing an OAuth access token or an X.509 client certificate in your requests to the OpenShift Container Platform API.
- AWS Load Balancer Operator
- The AWS Load Balancer (ALB) Operator deploys and manages an instance of the aws-load-balancer-controller.
- Cluster Network Operator
- The Cluster Network Operator (CNO) deploys and manages the cluster network components in an OpenShift Container Platform cluster. This includes deployment of the Container Network Interface (CNI) network plugin selected for the cluster during installation.
- config map
- A config map provides a way to inject configuration data into pods. You can reference the data stored in a config map in a volume of type ConfigMap. Applications running in a pod can use this data.
- custom resource (CR)
- A CR is an extension of the Kubernetes API. You can create custom resources.
- DNS
- Cluster DNS is a DNS server which serves DNS records for Kubernetes services. Containers started by Kubernetes automatically include this DNS server in their DNS searches.
- DNS Operator
- The DNS Operator deploys and manages CoreDNS to provide a name resolution service to pods. This enables DNS-based Kubernetes Service discovery in OpenShift Container Platform.
- deployment
- A Kubernetes resource object that maintains the life cycle of an application.
- domain
- Domain is a DNS name serviced by the Ingress Controller.
- egress
- The process of sharing data externally through a network’s outbound traffic from a pod.
- External DNS Operator
- The External DNS Operator deploys and manages ExternalDNS to provide the name resolution for services and routes from the external DNS provider to OpenShift Container Platform.
- HTTP-based route
- An HTTP-based route is an unsecured route that uses the basic HTTP routing protocol and exposes a service on an unsecured application port.
- Ingress
- The Kubernetes Ingress resource in OpenShift Container Platform implements the Ingress Controller with a shared router service that runs as a pod inside the cluster.
- Ingress Controller
- The Ingress Operator manages Ingress Controllers. Using an Ingress Controller is the most common way to allow external access to an OpenShift Container Platform cluster.
- installer-provisioned infrastructure
- The installation program deploys and configures the infrastructure that the cluster runs on.
- kubelet
- A primary node agent that runs on each node in the cluster to ensure that containers are running in a pod.
- Kubernetes NMState Operator
- The Kubernetes NMState Operator provides a Kubernetes API for performing state-driven network configuration across the OpenShift Container Platform cluster’s nodes with NMState.
- kube-proxy
- Kube-proxy is a proxy service which runs on each node and helps in making services available to the external host. It helps in forwarding the request to correct containers and is capable of performing primitive load balancing.
- load balancers
- OpenShift Container Platform uses load balancers for communicating from outside the cluster with services running in the cluster.
- MetalLB Operator
- As a cluster administrator, you can add the MetalLB Operator to your cluster so that when a service of type LoadBalancer is added to the cluster, MetalLB can add an external IP address for the service.
- multicast
- With IP multicast, data is broadcast to many IP addresses simultaneously.
- namespaces
- A namespace isolates specific system resources that are visible to all processes. Inside a namespace, only processes that are members of that namespace can see those resources.
- networking
- Network information of an OpenShift Container Platform cluster.
- node
- A worker machine in the OpenShift Container Platform cluster. A node is either a virtual machine (VM) or a physical machine.
- OpenShift Container Platform Ingress Operator
- The Ingress Operator implements the IngressController API and is the component responsible for enabling external access to OpenShift Container Platform services.
- pod
- One or more containers with shared resources, such as volume and IP addresses, running in your OpenShift Container Platform cluster. A pod is the smallest compute unit defined, deployed, and managed.
- PTP Operator
- The PTP Operator creates and manages the linuxptp services.
- route
- The OpenShift Container Platform route provides Ingress traffic to services in the cluster. Routes provide advanced features that might not be supported by standard Kubernetes Ingress Controllers, such as TLS re-encryption, TLS passthrough, and split traffic for blue-green deployments.
- scaling
- Increasing or decreasing the resource capacity.
- service
- Exposes a running application on a set of pods.
- Single Root I/O Virtualization (SR-IOV) Network Operator
- The Single Root I/O Virtualization (SR-IOV) Network Operator manages the SR-IOV network devices and network attachments in your cluster.
- software-defined networking (SDN)
- OpenShift Container Platform uses a software-defined networking (SDN) approach to provide a unified cluster network that enables communication between pods across the OpenShift Container Platform cluster.
- Stream Control Transmission Protocol (SCTP)
- SCTP is a reliable, message-based protocol that runs on top of an IP network.
- taint
- Taints and tolerations ensure that pods are scheduled onto appropriate nodes. You can apply one or more taints on a node.
- toleration
- You can apply tolerations to pods. Tolerations allow the scheduler to schedule pods with matching taints.
- web console
- A user interface (UI) to manage OpenShift Container Platform.
Chapter 3. Accessing hosts
Learn how to create a bastion host to access OpenShift Container Platform instances and access the control plane nodes with secure shell (SSH) access.
3.1. Accessing hosts on Amazon Web Services in an installer-provisioned infrastructure cluster
The OpenShift Container Platform installer does not create any public IP addresses for any of the Amazon Elastic Compute Cloud (Amazon EC2) instances that it provisions for your OpenShift Container Platform cluster. To be able to SSH to your OpenShift Container Platform hosts, you must follow this procedure.
Procedure
- Create a security group that allows SSH access into the virtual private cloud (VPC) created by the openshift-install command.
- Create an Amazon EC2 instance on one of the public subnets the installer created.
- Associate a public IP address with the Amazon EC2 instance that you created.
Unlike with the OpenShift Container Platform installation, you should associate the Amazon EC2 instance you created with an SSH keypair. It does not matter what operating system you choose for this instance, as it will simply serve as an SSH bastion to bridge the internet into your OpenShift Container Platform cluster’s VPC. The Amazon Machine Image (AMI) you use does matter. With Red Hat Enterprise Linux CoreOS (RHCOS), for example, you can provide keys via Ignition, like the installer does.
After you provisioned your Amazon EC2 instance and can SSH into it, you must add the SSH key that you associated with your OpenShift Container Platform installation. This key can be different from the key for the bastion instance, but does not have to be.
Note: Direct SSH access is only recommended for disaster recovery. When the Kubernetes API is responsive, run privileged pods instead.
- Run oc get nodes, inspect the output, and choose one of the nodes that is a master. The hostname looks similar to ip-10-0-1-163.ec2.internal.
- From the bastion SSH host you manually deployed into Amazon EC2, SSH into that control plane host. Ensure that you use the same SSH key you specified during the installation:

  $ ssh -i <ssh-key-path> core@<master-hostname>
Chapter 4. Networking Operators overview
OpenShift Container Platform supports multiple types of networking Operators. You can manage the cluster networking using these networking Operators.
4.1. Cluster Network Operator
The Cluster Network Operator (CNO) deploys and manages the cluster network components in an OpenShift Container Platform cluster. This includes deployment of the Container Network Interface (CNI) network plugin selected for the cluster during installation. For more information, see Cluster Network Operator in OpenShift Container Platform.
4.2. DNS Operator
The DNS Operator deploys and manages CoreDNS to provide a name resolution service to pods. This enables DNS-based Kubernetes Service discovery in OpenShift Container Platform. For more information, see DNS Operator in OpenShift Container Platform.
4.3. Ingress Operator
When you create your OpenShift Container Platform cluster, pods and services running on the cluster are each allocated IP addresses. The IP addresses are accessible to other pods and services running nearby but are not accessible to external clients. The Ingress Operator implements the Ingress Controller API and is responsible for enabling external access to OpenShift Container Platform cluster services. For more information, see Ingress Operator in OpenShift Container Platform.
4.4. External DNS Operator
The External DNS Operator deploys and manages ExternalDNS to provide the name resolution for services and routes from the external DNS provider to OpenShift Container Platform. For more information, see Understanding the External DNS Operator.
4.5. Ingress Node Firewall Operator
The Ingress Node Firewall Operator uses an extended Berkeley Packet Filter (eBPF) and eXpress Data Path (XDP) plugin to process node firewall rules, update statistics, and generate events for dropped traffic. The Operator manages ingress node firewall resources, verifies firewall configuration, does not allow incorrectly configured rules that can prevent cluster access, and loads ingress node firewall XDP programs to the selected interfaces in the rule’s object(s). For more information, see Understanding the Ingress Node Firewall Operator.
4.6. Network Observability Operator
The Network Observability Operator is an optional Operator that allows cluster administrators to observe the network traffic for OpenShift Container Platform clusters. The Network Observability Operator uses the eBPF technology to create network flows. The network flows are then enriched with OpenShift Container Platform information and stored in Loki. You can view and analyze the stored network flows information in the OpenShift Container Platform console for further insight and troubleshooting. For more information, see About Network Observability Operator.
Chapter 5. Cluster Network Operator in OpenShift Container Platform
You can use the Cluster Network Operator (CNO) to deploy and manage cluster network components on an OpenShift Container Platform cluster, including the Container Network Interface (CNI) network plugin selected for the cluster during installation.
5.1. Cluster Network Operator
The Cluster Network Operator implements the network API from the operator.openshift.io API group. The Operator deploys the OVN-Kubernetes network plugin, or the network provider plugin that you selected during cluster installation, by using a daemon set.
Procedure
The Cluster Network Operator is deployed during installation as a Kubernetes Deployment.
Run the following command to view the Deployment status:
$ oc get -n openshift-network-operator deployment/network-operator

Example output

NAME               READY   UP-TO-DATE   AVAILABLE   AGE
network-operator   1/1     1            1           56m

Run the following command to view the state of the Cluster Network Operator:

$ oc get clusteroperator/network

Example output

NAME      VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
network   4.5.4     True        False         False      50m

The following fields provide information about the status of the Operator: AVAILABLE, PROGRESSING, and DEGRADED. The AVAILABLE field is True when the Cluster Network Operator reports an available status condition.
5.2. Viewing the cluster network configuration
Every new OpenShift Container Platform installation has a network.config object named cluster.
Procedure
Use the oc describe command to view the cluster network configuration:

$ oc describe network.config/cluster
5.3. Viewing Cluster Network Operator status
You can inspect the status and view the details of the Cluster Network Operator using the oc describe command.
Procedure
Run the following command to view the status of the Cluster Network Operator:
$ oc describe clusteroperators/network
5.4. Viewing Cluster Network Operator logs
You can view Cluster Network Operator logs by using the oc logs command.
Procedure
Run the following command to view the logs of the Cluster Network Operator:
$ oc logs --namespace=openshift-network-operator deployment/network-operator
5.5. Cluster Network Operator configuration
The configuration for the cluster network is specified as part of the Cluster Network Operator (CNO) configuration and stored in a custom resource (CR) object that is named cluster. The CR specifies the fields for the Network API in the operator.openshift.io API group.
The CNO configuration inherits the following fields during cluster installation from the Network API in the Network.config.openshift.io API group and these fields cannot be changed:
- clusterNetwork - IP address pools from which pod IP addresses are allocated.
- serviceNetwork - IP address pool for services.
- defaultNetwork.type - Cluster network plugin, such as OpenShift SDN or OVN-Kubernetes.
After cluster installation, you cannot modify the fields listed in the previous section.
You can specify the cluster network plugin configuration for your cluster by setting the fields for the defaultNetwork object in the CNO object named cluster.
5.5.1. Cluster Network Operator configuration object
The fields for the Cluster Network Operator (CNO) are described in the following table:
| Field | Type | Description |
|---|---|---|
| metadata.name | string | The name of the CNO object. This name is always cluster. |
| spec.clusterNetwork | array | A list specifying the blocks of IP addresses from which pod IP addresses are allocated and the subnet prefix length assigned to each individual node in the cluster. This value is read-only and inherited from the Network.config.openshift.io object named cluster during cluster installation. |
| spec.serviceNetwork | array | A block of IP addresses for services. The OpenShift SDN and OVN-Kubernetes network plugins support only a single IP address block for the service network. For example: spec: serviceNetwork: - 172.30.0.0/14. This value is read-only and inherited from the Network.config.openshift.io object named cluster during cluster installation. |
| spec.defaultNetwork | object | Configures the network plugin for the cluster network. |
| spec.kubeProxyConfig | object | The fields for this object specify the kube-proxy configuration. If you are using the OVN-Kubernetes cluster network plugin, the kube-proxy configuration has no effect. |
For a cluster that needs to deploy objects across multiple networks, ensure that you specify the same value for the clusterNetwork.hostPrefix parameter for each network type that is defined in the install-config.yaml file. Setting a different value for each clusterNetwork.hostPrefix parameter can impact the OVN-Kubernetes network plugin, where the plugin cannot effectively route object traffic among different nodes.
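For example, the networking stanza of a hypothetical install-config.yaml file that defines two cluster networks would use the same hostPrefix value for each entry; the CIDR values shown are illustrative:

networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  - cidr: 10.132.0.0/14
    hostPrefix: 23            # same hostPrefix for every defined network
  networkType: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16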
defaultNetwork object configuration
The values for the defaultNetwork object are defined in the following table:
| Field | Type | Description |
|---|---|---|
| type | string | Either OpenShiftSDN or OVNKubernetes. The network plugin is selected during installation and cannot be changed after cluster installation. Note: OpenShift Container Platform uses the OVN-Kubernetes network plugin by default. |
| openshiftSDNConfig | object | This object is only valid for the OpenShift SDN network plugin. |
| ovnKubernetesConfig | object | This object is only valid for the OVN-Kubernetes network plugin. |
Configuration for the OpenShift SDN network plugin
The following table describes the configuration fields for the OpenShift SDN network plugin:
| Field | Type | Description |
|---|---|---|
| mode | string | The network isolation mode for OpenShift SDN. |
| mtu | integer | The maximum transmission unit (MTU) for the VXLAN overlay network. This value is normally configured automatically. |
| vxlanPort | integer | The port to use for all VXLAN packets. The default value is 4789. |
Example OpenShift SDN configuration
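A minimal sketch of the openshiftSDNConfig stanza in the defaultNetwork object; the values shown are typical defaults:

defaultNetwork:
  type: OpenShiftSDN
  openshiftSDNConfig:
    mode: NetworkPolicy
    mtu: 1450
    vxlanPort: 4789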
Configuration for the OVN-Kubernetes network plugin
The following table describes the configuration fields for the OVN-Kubernetes network plugin:
| Field | Type | Description |
|---|---|---|
| mtu | integer | The maximum transmission unit (MTU) for the Geneve (Generic Network Virtualization Encapsulation) overlay network. This value is normally configured automatically. |
| genevePort | integer | The UDP port for the Geneve overlay network. |
| ipsecConfig | object | If the field is present, IPsec is enabled for the cluster. |
| policyAuditConfig | object | Specify a configuration object for customizing network policy audit logging. If unset, the default audit log settings are used. |
| gatewayConfig | object | Optional: Specify a configuration object for customizing how egress traffic is sent to the node gateway. Note: While migrating egress traffic, you can expect some disruption to workloads and service traffic until the Cluster Network Operator (CNO) successfully rolls out the changes. |
| v4InternalSubnet | string | If your existing network infrastructure overlaps with the 100.64.0.0/16 IPv4 subnet, you can specify a different internal IP address range for use by OVN-Kubernetes. This field cannot be changed after installation. The default value is 100.64.0.0/16. |
| v6InternalSubnet | string | If your existing network infrastructure overlaps with the fd98::/48 IPv6 subnet, you can specify a different internal IP address range for use by OVN-Kubernetes. This field cannot be changed after installation. The default value is fd98::/48. |
The following table describes the fields of the policyAuditConfig object:

| Field | Type | Description |
|---|---|---|
| rateLimit | integer | The maximum number of messages to generate every second per node. The default value is 20 messages per second. |
| maxFileSize | integer | The maximum size for the audit log in bytes. The default value is 50000000, or 50 MB. |
| destination | string | One of the following additional audit log targets: libc (the libc syslog() function of the journald process on the host), udp:<host>:<port> (a syslog server), unix:<file> (a Unix Domain Socket file), or null (do not send the audit logs to an additional target). The default value is null. |
| syslogFacility | string | The syslog facility, such as kern, as defined by RFC5424. The default value is local0. |
The following table describes the fields of the gatewayConfig object:

| Field | Type | Description |
|---|---|---|
| routingViaHost | boolean | Set this field to true to send egress traffic from pods to the host networking stack. For highly-specialized installations and applications that rely on manually configured routes in the kernel routing table, you might want to route egress traffic to the host networking stack. By default, egress traffic is processed in OVN to exit the cluster and is not affected by specialized routes in the kernel routing table. The default value is false. This field has an interaction with the Open vSwitch hardware offloading feature: if you set this field to true, you do not receive the performance benefits of the offloading because egress traffic is processed by the host networking stack. Note: In OpenShift Container Platform 4.12, egress IP is only assigned to the primary interface. |
You can only change the configuration for your cluster network plugin during cluster installation, except for the gatewayConfig field that can be changed at runtime as a postinstallation activity.
Example OVN-Kubernetes configuration with IPsec enabled
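A minimal sketch of the ovnKubernetesConfig stanza; the empty ipsecConfig object enables IPsec:

defaultNetwork:
  type: OVNKubernetes
  ovnKubernetesConfig:
    mtu: 1400
    genevePort: 6081
    ipsecConfig: {}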
kubeProxyConfig object configuration
The values for the kubeProxyConfig object are defined in the following table:
| Field | Type | Description |
|---|---|---|
| iptablesSyncPeriod | string | The refresh period for iptables rules. The default value is 30s. Note: Because of performance improvements introduced in OpenShift Container Platform 4.3 and greater, adjusting the iptablesSyncPeriod parameter is no longer necessary. |
| proxyArguments.iptables-min-sync-period | array | The minimum duration before refreshing iptables rules. This field ensures that the refresh does not happen too frequently. The default value is: kubeProxyConfig: proxyArguments: iptables-min-sync-period: - 0s |
5.5.2. Cluster Network Operator example configuration
A complete CNO configuration is specified in the following example:
Example Cluster Network Operator object
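A representative sketch of a complete configuration; the address ranges are illustrative:

apiVersion: operator.openshift.io/v1
kind: Network
metadata:
  name: cluster
spec:
  clusterNetwork:
  - cidr: 10.128.0.0/19
    hostPrefix: 23
  - cidr: 10.128.32.0/19
    hostPrefix: 23
  serviceNetwork:
  - 172.30.0.0/14
  defaultNetwork:
    type: OVNKubernetes
    ovnKubernetesConfig:
      mtu: 1400
      genevePort: 6081
      ipsecConfig: {}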
Chapter 6. DNS Operator in OpenShift Container Platform
The DNS Operator deploys and manages CoreDNS to provide a name resolution service to pods, enabling DNS-based Kubernetes Service discovery in OpenShift Container Platform.
6.1. DNS Operator
The DNS Operator implements the dns API from the operator.openshift.io API group. The Operator deploys CoreDNS using a daemon set, creates a service for the daemon set, and configures the kubelet to instruct pods to use the CoreDNS service IP address for name resolution.
Procedure
The DNS Operator is deployed during installation with a Deployment object.
Use the oc get command to view the deployment status:

$ oc get -n openshift-dns-operator deployment/dns-operator

Example output

NAME           READY   UP-TO-DATE   AVAILABLE   AGE
dns-operator   1/1     1            1           23h

Use the oc get command to view the state of the DNS Operator:

$ oc get clusteroperator/dns

Example output

NAME   VERSION      AVAILABLE   PROGRESSING   DEGRADED   SINCE
dns    4.1.0-0.11   True        False         False      92m

AVAILABLE, PROGRESSING, and DEGRADED provide information about the status of the Operator. AVAILABLE is True when at least one pod from the CoreDNS daemon set reports an Available status condition.
6.2. Changing the DNS Operator managementState
DNS manages the CoreDNS component to provide a name resolution service for pods and services in the cluster. The managementState of the DNS Operator is set to Managed by default, which means that the DNS Operator is actively managing its resources. You can change it to Unmanaged, which means the DNS Operator is not managing its resources.
The following are use cases for changing the DNS Operator managementState:
- You are a developer and want to test a configuration change to see if it fixes an issue in CoreDNS. You can stop the DNS Operator from overwriting the fix by setting the managementState to Unmanaged.
- You are a cluster administrator and have reported an issue with CoreDNS, but need to apply a workaround until the issue is fixed. You can set the managementState field of the DNS Operator to Unmanaged to apply the workaround.
Procedure
Change the managementState of the DNS Operator:

$ oc patch dns.operator.openshift.io default --type merge --patch '{"spec":{"managementState":"Unmanaged"}}'
6.3. Controlling DNS pod placement
The DNS Operator has two daemon sets: one for CoreDNS and one for managing the /etc/hosts file. The daemon set for /etc/hosts must run on every node host to add an entry for the cluster image registry to support pulling images. Security policies can prohibit communication between pairs of nodes, which prevents the daemon set for CoreDNS from running on every node.
As a cluster administrator, you can use a custom node selector to configure the daemon set for CoreDNS to run or not run on certain nodes.
Prerequisites
- You installed the oc CLI.
- You are logged in to the cluster as a user with cluster-admin privileges.
Procedure
To prevent communication between certain nodes, configure the spec.nodePlacement.nodeSelector API field:

- Modify the DNS Operator object named default:

  $ oc edit dns.operator/default

- Specify a node selector in the spec.nodePlacement.nodeSelector API field. For example, to run CoreDNS only on worker nodes:

  spec:
    nodePlacement:
      nodeSelector:
        node-role.kubernetes.io/worker: ""
To allow the daemon set for CoreDNS to run on nodes, configure a taint and toleration:
- Modify the DNS Operator object named default:

  $ oc edit dns.operator/default

- Specify a taint key and a toleration for the taint, as shown in the sketch that follows:
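A minimal sketch of a toleration for a taint with the key dns-only; the taint value and effect are illustrative:

spec:
  nodePlacement:
    tolerations:
    - effect: NoExecute
      key: "dns-only"
      operator: Equal
      value: abc               # illustrative taint value
      tolerationSeconds: 3600  # [1]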
1. If the taint is dns-only, it can be tolerated indefinitely. You can omit tolerationSeconds.
6.4. View the default DNS
Every new OpenShift Container Platform installation has a dns.operator named default.
Procedure
- Use the oc describe command to view the default dns:

  $ oc describe dns.operator/default

- To find the service CIDR of your cluster, use the oc get command:

  $ oc get networks.config/cluster -o jsonpath='{$.status.serviceNetwork}'

  Example output

  [172.30.0.0/16]
6.5. Using DNS forwarding
You can use DNS forwarding to override the default forwarding configuration in the /etc/resolv.conf file in the following ways:
- Specify name servers for every zone. If the forwarded zone is the Ingress domain managed by OpenShift Container Platform, then the upstream name server must be authorized for the domain.
- Provide a list of upstream DNS servers.
- Change the default forwarding policy.
A DNS forwarding configuration for the default domain can have both the default servers specified in the /etc/resolv.conf file and the upstream DNS servers.
Procedure
Modify the DNS Operator object named default:

$ oc edit dns.operator/default

After you issue the previous command, the Operator creates and updates the config map named dns-default with additional server configuration blocks based on Server. If none of the servers have a zone that matches the query, then name resolution falls back to the upstream DNS servers.

Configuring DNS forwarding
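A representative sketch of the DNS object; the server name, zone, and resolver addresses are illustrative, and the numbered comments correspond to the notes that follow:

apiVersion: operator.openshift.io/v1
kind: DNS
metadata:
  name: default
spec:
  servers:
  - name: example-server        # [1] illustrative name
    zones:
    - example.com               # [2] illustrative zone
    forwardPlugin:
      policy: Random            # [3]
      upstreams:                # [4]
      - 1.1.1.1
      - 2.2.2.2:5353
  upstreamResolvers:            # [5]
    policy: Random              # [6]
    upstreams:                  # [7]
    - type: SystemResolvConf    # [8]
    - type: Network
      address: 1.2.3.4          # [9]
      port: 53                  # [10]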
1. Must comply with the rfc6335 service name syntax.
2. Must conform to the definition of a subdomain in the rfc1123 service name syntax. The cluster domain, cluster.local, is an invalid subdomain for the zones field.
3. Defines the policy to select upstream resolvers. The default value is Random. You can also use the values RoundRobin and Sequential.
4. A maximum of 15 upstreams is allowed per forwardPlugin.
5. Optional. You can use it to override the default policy and forward DNS resolution to the specified DNS resolvers (upstream resolvers) for the default domain. If you do not provide any upstream resolvers, the DNS name queries go to the servers in /etc/resolv.conf.
6. Determines the order in which upstream servers are selected for querying. You can specify one of these values: Random, RoundRobin, or Sequential. The default value is Sequential.
7. Optional. You can use it to provide upstream resolvers.
8. You can specify two types of upstreams: SystemResolvConf and Network. SystemResolvConf configures the upstream to use /etc/resolv.conf and Network defines a Network resolver. You can specify one or both.
9. If the specified type is Network, you must provide an IP address. The address field must be a valid IPv4 or IPv6 address.
10. If the specified type is Network, you can optionally provide a port. The port field must have a value between 1 and 65535. If you do not specify a port for the upstream, port 853 is tried by default.
Optional: When working in a highly regulated environment, you might need the ability to secure DNS traffic when forwarding requests to upstream resolvers so that you can ensure additional DNS traffic and data privacy. Cluster administrators can configure transport layer security (TLS) for forwarded DNS queries.
Configuring DNS forwarding with TLS
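A sketch with TLS configured; the server name, CA bundle config map, and resolver addresses are illustrative, and the numbered comments correspond to the notes that follow:

apiVersion: operator.openshift.io/v1
kind: DNS
metadata:
  name: default
spec:
  servers:
  - name: example-server            # [1] illustrative name
    zones:
    - example.com                   # [2] illustrative zone
    forwardPlugin:
      transportConfig:
        transport: TLS              # [3]
        tls:
          caBundle:
            name: mycacert          # illustrative config map holding the CA certificate
          serverName: dnstls.example.com   # [4]
      policy: Random                # [5]
      upstreams:                    # [6]
      - 1.1.1.1
      - 2.2.2.2:5353
  upstreamResolvers:                # [7]
    transportConfig:
      transport: TLS
      tls:
        caBundle:
          name: mycacert
        serverName: dnstls.example.com
    upstreams:
    - type: Network                 # [8]
      address: 1.2.3.4              # [9]
      port: 53                      # [10]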
1. Must comply with the rfc6335 service name syntax.
2. Must conform to the definition of a subdomain in the rfc1123 service name syntax. The cluster domain, cluster.local, is an invalid subdomain for the zones field.
3. When configuring TLS for forwarded DNS queries, set the transport field to have the value TLS. By default, CoreDNS caches forwarded connections for 10 seconds. CoreDNS will hold a TCP connection open for those 10 seconds if no request is issued. With large clusters, ensure that your DNS server is aware that it might get many new connections to hold open because you can initiate a connection per node. Set up your DNS hierarchy accordingly to avoid performance issues.
4. When configuring TLS for forwarded DNS queries, this is a mandatory server name used as part of the server name indication (SNI) to validate the upstream TLS server certificate.
5. Defines the policy to select upstream resolvers. The default value is Random. You can also use the values RoundRobin and Sequential.
6. Required. You can use it to provide upstream resolvers. A maximum of 15 upstreams entries are allowed per forwardPlugin entry.
7. Optional. You can use it to override the default policy and forward DNS resolution to the specified DNS resolvers (upstream resolvers) for the default domain. If you do not provide any upstream resolvers, the DNS name queries go to the servers in /etc/resolv.conf.
8. The Network type indicates that this upstream resolver should handle forwarded requests separately from the upstream resolvers listed in /etc/resolv.conf. Only the Network type is allowed when using TLS, and you must provide an IP address.
9. The address field must be a valid IPv4 or IPv6 address.
10. You can optionally provide a port. The port must have a value between 1 and 65535. If you do not specify a port for the upstream, port 853 is tried by default.
Note: If servers is undefined or invalid, the config map only contains the default server.
Verification
View the config map:
$ oc get configmap/dns-default -n openshift-dns -o yaml

Note: Changes to the forwardPlugin trigger a rolling update of the CoreDNS daemon set.
6.6. DNS Operator status
You can inspect the status and view the details of the DNS Operator using the oc describe command.
Procedure
View the status of the DNS Operator:
$ oc describe clusteroperators/dns
6.7. DNS Operator logs
You can view DNS Operator logs by using the oc logs command.
Procedure
View the logs of the DNS Operator:
$ oc logs -n openshift-dns-operator deployment/dns-operator -c dns-operator
6.8. Setting the CoreDNS log level
You can configure the CoreDNS log level to determine the amount of detail in logged error messages. The valid values for CoreDNS log level are Normal, Debug, and Trace. The default logLevel is Normal.
The errors plugin is always enabled. The following logLevel settings report different error responses:
- logLevel: Normal enables the "errors" class: log . { class error }.
- logLevel: Debug enables the "denial" class: log . { class denial error }.
- logLevel: Trace enables the "all" class: log . { class all }.
Procedure
- To set logLevel to Debug, enter the following command:

  $ oc patch dnses.operator.openshift.io/default -p '{"spec":{"logLevel":"Debug"}}' --type=merge

- To set logLevel to Trace, enter the following command:

  $ oc patch dnses.operator.openshift.io/default -p '{"spec":{"logLevel":"Trace"}}' --type=merge
Verification
To ensure the desired log level was set, check the config map:
$ oc get configmap/dns-default -n openshift-dns -o yaml
6.9. Viewing the CoreDNS logs
You can view CoreDNS logs by using the oc logs command.
Procedure
View the logs of a specific CoreDNS pod by entering the following command:
$ oc -n openshift-dns logs -c dns <core_dns_pod_name>

Follow the logs of all CoreDNS pods by entering the following command:

$ oc -n openshift-dns logs -c dns -l dns.operator.openshift.io/daemonset-dns=default -f --max-log-requests=<number>

where <number> specifies the number of DNS pods to stream logs from. The maximum is 6.
6.10. Setting the CoreDNS Operator log level
Cluster administrators can configure the Operator log level to more quickly track down OpenShift DNS issues. The valid values for operatorLogLevel are Normal, Debug, and Trace. Trace has the most detailed information. The default operatorLogLevel is Normal. There are seven logging levels for issues: Trace, Debug, Info, Warning, Error, Fatal, and Panic. After the logging level is set, log entries with that severity or anything above it will be logged.
- operatorLogLevel: "Normal" sets logrus.SetLogLevel("Info").
- operatorLogLevel: "Debug" sets logrus.SetLogLevel("Debug").
- operatorLogLevel: "Trace" sets logrus.SetLogLevel("Trace").
Procedure
- To set operatorLogLevel to Debug, enter the following command:

  $ oc patch dnses.operator.openshift.io/default -p '{"spec":{"operatorLogLevel":"Debug"}}' --type=merge

- To set operatorLogLevel to Trace, enter the following command:

  $ oc patch dnses.operator.openshift.io/default -p '{"spec":{"operatorLogLevel":"Trace"}}' --type=merge
6.11. Tuning the CoreDNS cache
You can configure the maximum duration of both successful and unsuccessful caching, also known as positive and negative caching respectively, done by CoreDNS. Tuning the duration of caching of DNS query responses can reduce the load for any upstream DNS resolvers.
Procedure
Edit the DNS Operator object named default by running the following command:

$ oc edit dns.operator.openshift.io/default

Modify the time-to-live (TTL) caching values:
Configuring DNS caching
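A sketch of the cache stanza; the TTL values are illustrative and correspond to the notes that follow:

apiVersion: operator.openshift.io/v1
kind: DNS
metadata:
  name: default
spec:
  cache:
    positiveTTL: 1h      # [1]
    negativeTTL: 0.5h10m # [2]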
1. The string value 1h is converted to its respective number of seconds by CoreDNS. If this field is omitted, the value is assumed to be 0s and the cluster uses the internal default value of 900s as a fallback.
2. The string value can be a combination of units such as 0.5h10m and is converted to its respective number of seconds by CoreDNS. If this field is omitted, the value is assumed to be 0s and the cluster uses the internal default value of 30s as a fallback.
Warning: Setting TTL fields to low values could lead to an increased load on the cluster, any upstream resolvers, or both.
Chapter 7. Ingress Operator in OpenShift Container Platform
7.1. OpenShift Container Platform Ingress Operator
When you create your OpenShift Container Platform cluster, pods and services running on the cluster are each allocated their own IP addresses. The IP addresses are accessible to other pods and services running nearby but are not accessible to outside clients. The Ingress Operator implements the IngressController API and is the component responsible for enabling external access to OpenShift Container Platform cluster services.
The Ingress Operator makes it possible for external clients to access your service by deploying and managing one or more HAProxy-based Ingress Controllers to handle routing. You can use the Ingress Operator to route traffic by specifying OpenShift Container Platform Route and Kubernetes Ingress resources. Configurations within the Ingress Controller, such as the ability to define endpointPublishingStrategy type and internal load balancing, provide ways to publish Ingress Controller endpoints.
7.2. The Ingress configuration asset
The installation program generates an asset with an Ingress resource in the config.openshift.io API group, cluster-ingress-02-config.yml.
YAML Definition of the Ingress resource
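A representative sketch of this asset; the domain value is illustrative:

apiVersion: config.openshift.io/v1
kind: Ingress
metadata:
  name: cluster
spec:
  domain: apps.openshiftdemos.com   # illustrative cluster Ingress domain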
The installation program stores this asset in the cluster-ingress-02-config.yml file in the manifests/ directory. This Ingress resource defines the cluster-wide configuration for Ingress. This Ingress configuration is used as follows:
- The Ingress Operator uses the domain from the cluster Ingress configuration as the domain for the default Ingress Controller.
- The OpenShift API Server Operator uses the domain from the cluster Ingress configuration. This domain is also used when generating a default host for a Route resource that does not specify an explicit host.
7.3. Ingress Controller configuration parameters
The IngressController custom resource (CR) includes optional configuration parameters that you can configure to meet specific needs for your organization.
| Parameter | Description |
|---|---|
|
|
The
If empty, the default value is |
|
|
|
|
|
For cloud environments, use the
On Google Cloud, AWS, and Azure you can configure the following
If not set, the default value is based on
For most platforms, the
For non-cloud environments, such as a bare-metal platform, use the
If you do not set a value in one of these fields, the default value is based on binding ports specified in the
If you need to update the
|
|
|
The
The secret must contain the following keys and data: *
If not set, a wildcard certificate is automatically generated and used. The certificate is valid for the Ingress Controller The in-use certificate, whether generated or user-specified, is automatically integrated with OpenShift Container Platform built-in OAuth server. |
|
|
|
|
|
|
|
|
If not set, the defaults values are used. Note
The |
|
|
If not set, the default value is based on the
When using the
The minimum TLS version for Ingress Controllers is Note
Ciphers and the minimum TLS version of the configured security profile are reflected in the Important
The Ingress Operator converts the TLS |
|
|
The
The |
|
|
|
|
|
|
|
|
By setting the
By default, the policy is set to
By setting These adjustments are only applied to cleartext, edge-terminated, and re-encrypt routes, and only when using HTTP/1.
For request headers, these adjustments are applied only for routes that have the |
|
|
|
|
|
|
|
|
For any cookie that you want to capture, the following parameters must be in your
For example: httpCaptureCookies:
- matchType: Exact
maxLength: 128
name: MYCOOKIE
|
|
|
|
|
|
|
|
|
The
|
|
|
The
These connections come from load balancer health probes or web browser speculative connections (preconnect) and can be safely ignored. However, these requests can be caused by network errors, so setting this field to |
7.3.1. Ingress Controller TLS security profiles
TLS security profiles provide a way for servers to regulate which ciphers a connecting client can use when connecting to the server.
7.3.1.1. Understanding TLS security profiles
You can use a TLS (Transport Layer Security) security profile to define which TLS ciphers are required by various OpenShift Container Platform components. The OpenShift Container Platform TLS security profiles are based on Mozilla recommended configurations.
You can specify one of the following TLS security profiles for each component:
| Profile | Description |
|---|---|
| Old | This profile is intended for use with legacy clients or libraries. The profile is based on the Old backward compatibility recommended configuration. The Old profile requires a minimum TLS version of 1.0. Note: For the Ingress Controller, the minimum TLS version is converted from 1.0 to 1.1. |
| Intermediate | This profile is the default TLS security profile for the Ingress Controller, kubelet, and control plane. The profile is based on the Intermediate compatibility recommended configuration. The Intermediate profile requires a minimum TLS version of 1.2. Note: This profile is the recommended configuration for the majority of clients. |
| Modern | This profile is intended for use with modern clients that have no need for backwards compatibility. This profile is based on the Modern compatibility recommended configuration. The Modern profile requires a minimum TLS version of 1.3. |
| Custom | This profile allows you to define the TLS version and ciphers to use. Warning: Use caution when using a Custom profile, because invalid configurations can cause problems. |
When using one of the predefined profile types, the effective profile configuration is subject to change between releases. For example, given a specification to use the Intermediate profile deployed on release X.Y.Z, an upgrade to release X.Y.Z+1 might cause a new profile configuration to be applied, resulting in a rollout.
7.3.1.2. Configuring the TLS security profile for the Ingress Controller
To configure a TLS security profile for an Ingress Controller, edit the IngressController custom resource (CR) to specify a predefined or custom TLS security profile. If a TLS security profile is not configured, the default value is based on the TLS security profile set for the API server.
Sample IngressController CR that configures the Old TLS security profile
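A minimal sketch of such a CR:

apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: default
  namespace: openshift-ingress-operator
spec:
  tlsSecurityProfile:
    old: {}
    type: Old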
The TLS security profile defines the minimum TLS version and the TLS ciphers for TLS connections for Ingress Controllers.
You can see the ciphers and the minimum TLS version of the configured TLS security profile in the IngressController custom resource (CR) under Status.Tls Profile and the configured TLS security profile under Spec.Tls Security Profile. For the Custom TLS security profile, the specific ciphers and minimum TLS version are listed under both parameters.
The HAProxy Ingress Controller image supports TLS 1.3 and the Modern profile.
The Ingress Operator also converts the TLS 1.0 of an Old or Custom profile to 1.1.
Prerequisites
- You have access to the cluster as a user with the cluster-admin role.
Procedure
- Edit the IngressController CR in the openshift-ingress-operator project to configure the TLS security profile:

  $ oc edit IngressController default -n openshift-ingress-operator

- Add the spec.tlsSecurityProfile field, as shown in the sample that follows, and then save the file to apply the changes.

Sample IngressController CR for a Custom profile
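A sketch of a Custom profile; the cipher list and minimum TLS version shown are illustrative:

apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: default
  namespace: openshift-ingress-operator
spec:
  tlsSecurityProfile:
    type: Custom
    custom:
      ciphers:
      - ECDHE-ECDSA-CHACHA20-POLY1305
      - ECDHE-RSA-CHACHA20-POLY1305
      - ECDHE-RSA-AES128-GCM-SHA256
      - ECDHE-ECDSA-AES128-GCM-SHA256
      minTLSVersion: VersionTLS11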
Verification
Verify that the profile is set in the
IngressController CR:

$ oc describe IngressController default -n openshift-ingress-operator
7.3.1.3. Configuring mutual TLS authentication
You can configure the Ingress Controller to enable mutual TLS (mTLS) authentication by setting a spec.clientTLS value. The clientTLS value configures the Ingress Controller to verify client certificates. This configuration includes setting a clientCA value, which is a reference to a config map. The config map contains the PEM-encoded CA certificate bundle that is used to verify a client’s certificate. Optionally, you can also configure a list of certificate subject filters.
If the clientCA value specifies an X509v3 certificate revocation list (CRL) distribution point, the Ingress Operator downloads and manages a CRL config map based on the HTTP URI X509v3 CRL Distribution Point specified in each provided certificate. The Ingress Controller uses this config map during mTLS/TLS negotiation. Requests that do not provide valid certificates are rejected.
Prerequisites
- You have access to the cluster as a user with the cluster-admin role.
- You have a PEM-encoded CA certificate bundle.
- If your CA bundle references a CRL distribution point, you must have also included the end-entity or leaf certificate in the client CA bundle. This certificate must have included an HTTP URI under CRL Distribution Points, as described in RFC 5280. For example:

  Issuer: C=US, O=Example Inc, CN=Example Global G2 TLS RSA SHA256 2020 CA1
  Subject: SOME SIGNED CERT
  X509v3 CRL Distribution Points:
    Full Name:
      URI:http://crl.example.com/example.crl
Procedure
- In the openshift-config namespace, create a config map from your CA bundle:

  $ oc create configmap \
      router-ca-certs-default \
      --from-file=ca-bundle.pem=client-ca.crt \
      -n openshift-config

  The config map data key must be ca-bundle.pem, and the data value must be a CA certificate in PEM format.
- Edit the IngressController resource in the openshift-ingress-operator project:

  $ oc edit IngressController default -n openshift-ingress-operator

- Add the spec.clientTLS field and subfields to configure mutual TLS:

Sample IngressController CR for a clientTLS profile that specifies filtering patterns
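A sketch of such a CR; the config map name matches the one created in the previous step, and the subject pattern is illustrative:

apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: default
  namespace: openshift-ingress-operator
spec:
  clientTLS:
    clientCertificatePolicy: Required
    clientCA:
      name: router-ca-certs-default
    allowedSubjectPatterns:
    - "^/CN=example.com/ST=NC/C=US/O=Security/OU=OpenShift$"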
Optional, get the Distinguished Name (DN) for
allowedSubjectPatternsby entering the following command.
openssl x509 -in custom-cert.pem -noout -subject subject= /CN=example.com/ST=NC/C=US/O=Security/OU=OpenShift
$ openssl x509 -in custom-cert.pem -noout -subject
subject= /CN=example.com/ST=NC/C=US/O=Security/OU=OpenShift
7.4. View the default Ingress Controller
The Ingress Operator is a core feature of OpenShift Container Platform and is enabled out of the box.
Every new OpenShift Container Platform installation has an ingresscontroller named default. It can be supplemented with additional Ingress Controllers. If the default ingresscontroller is deleted, the Ingress Operator will automatically recreate it within a minute.
Procedure
View the default Ingress Controller:
$ oc describe --namespace=openshift-ingress-operator ingresscontroller/default
7.5. View Ingress Operator status
You can view and inspect the status of your Ingress Operator.
Procedure
View your Ingress Operator status:
$ oc describe clusteroperators/ingress
7.6. View Ingress Controller logs
You can view your Ingress Controller logs.
Procedure
View your Ingress Controller logs:
$ oc logs --namespace=openshift-ingress-operator deployments/ingress-operator -c <container_name>
7.7. View Ingress Controller status
You can view the status of a particular Ingress Controller.
Procedure
View the status of an Ingress Controller:
$ oc describe --namespace=openshift-ingress-operator ingresscontroller/<name>
7.8. Configuring the Ingress Controller
7.8.1. Setting a custom default certificate
As an administrator, you can configure an Ingress Controller to use a custom certificate by creating a Secret resource and editing the IngressController custom resource (CR).
Prerequisites
- You must have a certificate/key pair in PEM-encoded files, where the certificate is signed by a trusted certificate authority or by a private trusted certificate authority that you configured in a custom PKI.
Your certificate meets the following requirements:
- The certificate is valid for the ingress domain.
- The certificate uses the subjectAltName extension to specify a wildcard domain, such as *.apps.ocp4.example.com.
You must have an IngressController CR. You may use the default one:

$ oc --namespace openshift-ingress-operator get ingresscontrollers

Example output

NAME      AGE
default   10m
If you have intermediate certificates, they must be included in the tls.crt file of the secret containing a custom default certificate. Order matters when specifying a certificate; list your intermediate certificate(s) after any server certificate(s).
Procedure
The following assumes that the custom certificate and key pair are in the tls.crt and tls.key files in the current working directory. Substitute the actual path names for tls.crt and tls.key. You also may substitute another name for custom-certs-default when creating the Secret resource and referencing it in the IngressController CR.
This action will cause the Ingress Controller to be redeployed, using a rolling deployment strategy.
Create a Secret resource containing the custom certificate in the openshift-ingress namespace by using the tls.crt and tls.key files:

$ oc --namespace openshift-ingress create secret tls custom-certs-default --cert=tls.crt --key=tls.key

Update the IngressController CR to reference the new certificate secret:

$ oc patch --type=merge --namespace openshift-ingress-operator ingresscontrollers/default \
  --patch '{"spec":{"defaultCertificate":{"name":"custom-certs-default"}}}'

Verify the update was effective:

$ echo Q |\
  openssl s_client -connect console-openshift-console.apps.<domain>:443 -showcerts 2>/dev/null |\
  openssl x509 -noout -subject -issuer -enddate

where:
<domain>- Specifies the base domain name for your cluster.
Example output
subject=C = US, ST = NC, L = Raleigh, O = RH, OU = OCP4, CN = *.apps.example.com
issuer=C = US, ST = NC, L = Raleigh, O = RH, OU = OCP4, CN = example.com
notAfter=May 10 08:32:45 2022 GMT

Tip: You can alternatively apply the following YAML to set a custom default certificate:
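For example, assuming the secret is named custom-certs-default as in the previous step:

apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: default
  namespace: openshift-ingress-operator
spec:
  defaultCertificate:
    name: custom-certs-default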
The certificate secret name should match the value used to update the CR.
Once the IngressController CR has been modified, the Ingress Operator updates the Ingress Controller’s deployment to use the custom certificate.
7.8.2. Removing a custom default certificate Copy linkLink copied to clipboard!
As an administrator, you can remove a custom certificate that you configured an Ingress Controller to use.
Prerequisites
- You have access to the cluster as a user with the cluster-admin role.
- You have installed the OpenShift CLI (oc).
- You previously configured a custom default certificate for the Ingress Controller.
Procedure
To remove the custom certificate and restore the certificate that ships with OpenShift Container Platform, enter the following command:
oc patch -n openshift-ingress-operator ingresscontrollers/default \ --type json -p $'- op: remove\n path: /spec/defaultCertificate'
$ oc patch -n openshift-ingress-operator ingresscontrollers/default \ --type json -p $'- op: remove\n path: /spec/defaultCertificate'Copy to Clipboard Copied! Toggle word wrap Toggle overflow There can be a delay while the cluster reconciles the new certificate configuration.
Verification
To confirm that the original cluster certificate is restored, enter the following command:
echo Q | \ openssl s_client -connect console-openshift-console.apps.<domain>:443 -showcerts 2>/dev/null | \ openssl x509 -noout -subject -issuer -enddate
$ echo Q | \ openssl s_client -connect console-openshift-console.apps.<domain>:443 -showcerts 2>/dev/null | \ openssl x509 -noout -subject -issuer -enddateCopy to Clipboard Copied! Toggle word wrap Toggle overflow where:
<domain>- Specifies the base domain name for your cluster.
Example output
subject=CN = *.apps.<domain> issuer=CN = ingress-operator@1620633373 notAfter=May 10 10:44:36 2023 GMT
subject=CN = *.apps.<domain> issuer=CN = ingress-operator@1620633373 notAfter=May 10 10:44:36 2023 GMTCopy to Clipboard Copied! Toggle word wrap Toggle overflow
7.8.3. Autoscaling an Ingress Controller Copy linkLink copied to clipboard!
You can automatically scale an Ingress Controller to dynamically meet routing performance or availability requirements, such as the requirement to increase throughput.
The following procedure provides an example for scaling up the default Ingress Controller.
Prerequisites
- You have the OpenShift CLI (oc) installed.
- You have access to an OpenShift Container Platform cluster as a user with the cluster-admin role.
- You installed the Custom Metrics Autoscaler Operator and an associated KEDA Controller.
  - You can install the Operator by using OperatorHub on the web console. After you install the Operator, you can create an instance of KedaController.
Procedure
Create a service account to authenticate with Thanos by running the following command:
$ oc create -n openshift-ingress-operator serviceaccount thanos && oc describe -n openshift-ingress-operator serviceaccount thanos

Manually create the service account secret token with the following command:
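One way to create such a token is to apply a service account token secret; the secret name thanos-token is illustrative:

$ oc apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: thanos-token
  namespace: openshift-ingress-operator
  annotations:
    kubernetes.io/service-account.name: thanos
type: kubernetes.io/service-account-token
EOF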
Define a TriggerAuthentication object within the openshift-ingress-operator namespace by using the service account's token.

Define the secret variable that contains the secret by running the following command:

$ secret=$(oc get secret -n openshift-ingress-operator | grep thanos-token | head -n 1 | awk '{ print $1 }')

Create the TriggerAuthentication object and pass the value of the secret variable to the TOKEN parameter:
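One possible approach is to pass the secret name through an OpenShift template; the TriggerAuthentication name keda-trigger-auth-prometheus is illustrative:

$ oc process TOKEN="$secret" -f - <<EOF | oc apply -n openshift-ingress-operator -f -
apiVersion: template.openshift.io/v1
kind: Template
parameters:
- name: TOKEN
objects:
- apiVersion: keda.sh/v1alpha1
  kind: TriggerAuthentication
  metadata:
    name: keda-trigger-auth-prometheus
    namespace: openshift-ingress-operator
  spec:
    secretTargetRef:
    - parameter: bearerToken
      name: \${TOKEN}
      key: token
    - parameter: ca
      name: \${TOKEN}
      key: ca.crt
EOF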
Create and apply a role for reading metrics from Thanos:
Create a new role, thanos-metrics-reader.yaml, that reads metrics from pods and nodes:

thanos-metrics-reader.yaml
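A role definition along the following lines grants read access to pod and node metrics; adjust it to your own policies:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: thanos-metrics-reader
  namespace: openshift-ingress-operator
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  verbs:
  - get
- apiGroups:
  - metrics.k8s.io
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch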
Apply the new role by running the following command:

$ oc apply -f thanos-metrics-reader.yaml
Add the new role to the service account by entering the following commands:
$ oc adm policy -n openshift-ingress-operator add-role-to-user thanos-metrics-reader -z thanos --role-namespace=openshift-ingress-operator

$ oc adm policy -n openshift-ingress-operator add-cluster-role-to-user cluster-monitoring-view -z thanos

Note: The add-cluster-role-to-user argument is only required if you use cross-namespace queries. The following step uses a query from the kube-metrics namespace, which requires this argument.
Create a new ScaledObject YAML file, ingress-autoscaler.yaml, that targets the default Ingress Controller deployment:

Example ScaledObject definition
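The following sketch shows one possible definition; the object name, replica bounds, and query threshold are illustrative, and the numbered comments correspond to the callouts below:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: ingress-scaler
  namespace: openshift-ingress-operator
spec:
  scaleTargetRef:                     # 1
    apiVersion: operator.openshift.io/v1
    kind: IngressController
    name: default
    envSourceContainerName: ingress-operator
  minReplicaCount: 1
  maxReplicaCount: 20                 # 2
  triggers:
  - type: prometheus
    metricType: AverageValue
    metadata:
      serverAddress: https://thanos-querier.openshift-monitoring.svc.cluster.local:9091   # 3
      namespace: openshift-ingress-operator                                               # 4
      metricName: 'kube-node-role'
      threshold: '1'
      query: 'sum(kube_node_role{role="worker",service="kube-state-metrics"})'            # 5
      authModes: "bearer"
    authenticationRef:
      name: keda-trigger-auth-prometheus
      kind: TriggerAuthentication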
1. The custom resource that you are targeting. In this case, the Ingress Controller.
2. Optional: The maximum number of replicas. If you omit this field, the default maximum is set to 100 replicas.
3. The Thanos service endpoint in the openshift-monitoring namespace.
4. The Ingress Operator namespace.
5. This expression evaluates to however many worker nodes are present in the deployed cluster.
Important: If you are using cross-namespace queries, you must target port 9091 and not port 9092 in the serverAddress field. You also must have elevated privileges to read metrics from this port.

Apply the custom resource definition by running the following command:
oc apply -f ingress-autoscaler.yaml
$ oc apply -f ingress-autoscaler.yamlCopy to Clipboard Copied! Toggle word wrap Toggle overflow
Verification
Verify that the default Ingress Controller is scaled out to match the value returned by the kube-state-metrics query by running the following commands:

Use the grep command to search the Ingress Controller YAML file for replicas:

$ oc get -n openshift-ingress-operator ingresscontroller/default -o yaml | grep replicas:

Example output
replicas: 3

Get the pods in the openshift-ingress project:

$ oc get pods -n openshift-ingress
Example output

NAME                             READY   STATUS    RESTARTS   AGE
router-default-7b5df44ff-l9pmm   2/2     Running   0          17h
router-default-7b5df44ff-s5sl5   2/2     Running   0          3d22h
router-default-7b5df44ff-wwsth   2/2     Running   0          66s
7.8.4. Scaling an Ingress Controller Copy linkLink copied to clipboard!
Manually scale an Ingress Controller to meet routing performance or availability requirements, such as the requirement to increase throughput. oc commands are used to scale the IngressController resource. The following procedure provides an example for scaling up the default IngressController.
Scaling is not an immediate action, as it takes time to create the desired number of replicas.
Procedure
View the current number of available replicas for the default IngressController:

$ oc get -n openshift-ingress-operator ingresscontrollers/default -o jsonpath='{$.status.availableReplicas}'

Example output

2

Scale the default IngressController to the desired number of replicas by using the oc patch command. The following example scales the default IngressController to 3 replicas:

$ oc patch -n openshift-ingress-operator ingresscontroller/default --patch '{"spec":{"replicas": 3}}' --type=merge

Example output
ingresscontroller.operator.openshift.io/default patched

Verify that the default IngressController scaled to the number of replicas that you specified:

$ oc get -n openshift-ingress-operator ingresscontrollers/default -o jsonpath='{$.status.availableReplicas}'

Example output

3
Tip: You can alternatively apply the following YAML to scale an Ingress Controller to three replicas:
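A minimal manifest for this is:

apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: default
  namespace: openshift-ingress-operator
spec:
  replicas: 3   # 1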
1. If you need a different number of replicas, change the replicas value.
7.8.5. Configuring Ingress access logging Copy linkLink copied to clipboard!
You can configure the Ingress Controller to enable access logs. If you have clusters that do not receive much traffic, then you can log to a sidecar. If you have high traffic clusters, to avoid exceeding the capacity of the logging stack or to integrate with a logging infrastructure outside of OpenShift Container Platform, you can forward logs to a custom syslog endpoint. You can also specify the format for access logs.
Container logging is useful to enable access logs on low-traffic clusters when there is no existing Syslog logging infrastructure, or for short-term use while diagnosing problems with the Ingress Controller.
Syslog is needed for high-traffic clusters where access logs could exceed the OpenShift Logging stack’s capacity, or for environments where any logging solution needs to integrate with an existing Syslog logging infrastructure. The Syslog use-cases can overlap.
Prerequisites
- Log in as a user with cluster-admin privileges.
Procedure
Configure Ingress access logging to a sidecar.
To configure Ingress access logging, you must specify a destination using spec.logging.access.destination. To specify logging to a sidecar container, you must specify Container for spec.logging.access.destination.type. The following example is an Ingress Controller definition that logs to a Container destination:
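A minimal sketch of such a definition:

apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: default
  namespace: openshift-ingress-operator
spec:
  logging:
    access:
      destination:
        type: Container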
When you configure the Ingress Controller to log to a sidecar, the Operator creates a container named logs inside the Ingress Controller pod:

$ oc -n openshift-ingress logs deployment.apps/router-default -c logs

Example output
2020-05-11T19:11:50.135710+00:00 router-default-57dfc6cd95-bpmk6 router-default-57dfc6cd95-bpmk6 haproxy[108]: 174.19.21.82:39654 [11/May/2020:19:11:50.133] public be_http:hello-openshift:hello-openshift/pod:hello-openshift:hello-openshift:10.128.2.12:8080 0/0/1/0/1 200 142 - - --NI 1/1/0/0/0 0/0 "GET / HTTP/1.1"
Configure Ingress access logging to a Syslog endpoint.
To configure Ingress access logging, you must specify a destination using spec.logging.access.destination. To specify logging to a Syslog endpoint destination, you must specify Syslog for spec.logging.access.destination.type. If the destination type is Syslog, you must also specify a destination endpoint using spec.logging.access.destination.syslog.endpoint, and you can specify a facility using spec.logging.access.destination.syslog.facility. The following example is an Ingress Controller definition that logs to a Syslog destination:
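A minimal sketch, using an illustrative endpoint address and port:

apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: default
  namespace: openshift-ingress-operator
spec:
  logging:
    access:
      destination:
        type: Syslog
        syslog:
          address: 1.2.3.4
          port: 10514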
Note: The syslog destination port must be UDP.
Configure Ingress access logging with a specific log format.
You can specify spec.logging.access.httpLogFormat to customize the log format. The following example is an Ingress Controller definition that logs to a syslog endpoint with IP address 1.2.3.4 and port 10514:
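A sketch of such a definition; the httpLogFormat string is an illustrative HAProxy log-format expression:

apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: default
  namespace: openshift-ingress-operator
spec:
  logging:
    access:
      destination:
        type: Syslog
        syslog:
          address: 1.2.3.4
          port: 10514
      httpLogFormat: '%ci:%cp [%t] %ft %b/%s %B %bq %HM %HU %HV'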
Disable Ingress access logging.
To disable Ingress access logging, leave spec.logging or spec.logging.access empty.
7.8.6. Setting Ingress Controller thread count Copy linkLink copied to clipboard!
A cluster administrator can set the thread count to increase the number of incoming connections a cluster can handle. You can patch an existing Ingress Controller to increase the number of threads.
Prerequisites
- The following assumes that you already created an Ingress Controller.
Procedure
Update the Ingress Controller to increase the number of threads:
$ oc -n openshift-ingress-operator patch ingresscontroller/default --type=merge -p '{"spec":{"tuningOptions": {"threadCount": 8}}}'

Note: If you have a node that is capable of running large amounts of resources, you can configure spec.nodePlacement.nodeSelector with labels that match the capacity of the intended node, and configure spec.tuningOptions.threadCount to an appropriately high value.
7.8.7. Configuring an Ingress Controller to use an internal load balancer Copy linkLink copied to clipboard!
When creating an Ingress Controller on cloud platforms, the Ingress Controller is published by a public cloud load balancer by default. As an administrator, you can create an Ingress Controller that uses an internal cloud load balancer.
If your cloud provider is Microsoft Azure, you must have at least one public load balancer that points to your nodes. If you do not, all of your nodes will lose egress connectivity to the internet.
If you want to change the scope for an IngressController, you can change the .spec.endpointPublishingStrategy.loadBalancer.scope parameter after the custom resource (CR) is created.
Figure 7.1. Diagram of LoadBalancer
The preceding graphic shows the following concepts pertaining to OpenShift Container Platform Ingress LoadBalancerService endpoint publishing strategy:
- You can load balance externally, using the cloud provider load balancer, or internally, using the OpenShift Ingress Controller Load Balancer.
- You can use the single IP address of the load balancer and more familiar ports, such as 8080 and 4200 as shown on the cluster depicted in the graphic.
- Traffic from the external load balancer is directed at the pods, and managed by the load balancer, as depicted in the instance of a down node. See the Kubernetes Services documentation for implementation details.
Prerequisites
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
Procedure
Create an IngressController custom resource (CR) in a file named <name>-ingress-controller.yaml, such as in the following example:
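A sketch of such a CR; replace the placeholders with your controller name and domain:

apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  namespace: openshift-ingress-operator
  name: <name>
spec:
  domain: <domain>
  endpointPublishingStrategy:
    type: LoadBalancerService
    loadBalancer:
      scope: Internal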
Create the Ingress Controller defined in the previous step by running the following command:

$ oc create -f <name>-ingress-controller.yaml

Replace <name> with the name of the IngressController object.
Optional: Confirm that the Ingress Controller was created by running the following command:
oc --all-namespaces=true get ingresscontrollers
$ oc --all-namespaces=true get ingresscontrollersCopy to Clipboard Copied! Toggle word wrap Toggle overflow
7.8.8. Configuring global access for an Ingress Controller on Google Cloud Copy linkLink copied to clipboard!
An Ingress Controller created on Google Cloud with an internal load balancer generates an internal IP address for the service. A cluster administrator can specify the global access option, which enables clients in any region within the same VPC network and compute region as the load balancer, to reach the workloads running on your cluster.
For more information, see the Google Cloud documentation for global access.
Prerequisites
- You deployed an OpenShift Container Platform cluster on Google Cloud infrastructure.
- You configured an Ingress Controller to use an internal load balancer.
- You installed the OpenShift CLI (oc).
Procedure
Configure the Ingress Controller resource to allow global access.
NoteYou can also create an Ingress Controller and specify the global access option.
Configure the Ingress Controller resource:
oc -n openshift-ingress-operator edit ingresscontroller/default
$ oc -n openshift-ingress-operator edit ingresscontroller/defaultCopy to Clipboard Copied! Toggle word wrap Toggle overflow Edit the YAML file:
Sample clientAccess configuration to Global
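The relevant part of the spec looks similar to the following sketch:

spec:
  endpointPublishingStrategy:
    loadBalancer:
      providerParameters:
        gcp:
          clientAccess: Global   # 1
        type: GCP
      scope: Internal
    type: LoadBalancerService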
1. Set gcp.clientAccess to Global.
- Save the file to apply the changes.
Run the following command to verify that the service allows global access:
$ oc -n openshift-ingress edit svc/router-default -o yaml

The output shows that global access is enabled for Google Cloud with the annotation networking.gke.io/internal-load-balancer-allow-global-access.
7.8.9. Setting the Ingress Controller health check interval Copy linkLink copied to clipboard!
A cluster administrator can set the health check interval to define how long the router waits between two consecutive health checks. This value is applied globally as a default for all routes. The default value is 5 seconds.
Prerequisites
- The following assumes that you already created an Ingress Controller.
Procedure
Update the Ingress Controller to change the interval between back end health checks:
$ oc -n openshift-ingress-operator patch ingresscontroller/default --type=merge -p '{"spec":{"tuningOptions": {"healthCheckInterval": "8s"}}}'

Note: To override the healthCheckInterval for a single route, use the route annotation router.openshift.io/haproxy.health.check.interval.
7.8.10. Configuring the default Ingress Controller for your cluster to be internal Copy linkLink copied to clipboard!
You can configure the default Ingress Controller for your cluster to be internal by deleting and recreating it.
If your cloud provider is Microsoft Azure, you must have at least one public load balancer that points to your nodes. If you do not, all of your nodes will lose egress connectivity to the internet.
If you want to change the scope for an IngressController, you can change the .spec.endpointPublishingStrategy.loadBalancer.scope parameter after the custom resource (CR) is created.
Prerequisites
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
Procedure
Configure the default Ingress Controller for your cluster to be internal by deleting and recreating it:
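One way to do this in a single step is with oc replace; the spec shown is a minimal sketch:

$ oc replace --force --wait --filename - <<EOF
apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  namespace: openshift-ingress-operator
  name: default
spec:
  endpointPublishingStrategy:
    type: LoadBalancerService
    loadBalancer:
      scope: Internal
EOF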
7.8.11. Configuring the route admission policy Copy linkLink copied to clipboard!
Administrators and application developers can run applications in multiple namespaces with the same domain name. This is for organizations where multiple teams develop microservices that are exposed on the same hostname.
Allowing claims across namespaces should only be enabled for clusters with trust between namespaces, otherwise a malicious user could take over a hostname. For this reason, the default admission policy disallows hostname claims across namespaces.
Prerequisites
- Cluster administrator privileges.
Procedure
Edit the .spec.routeAdmission field of the ingresscontroller resource variable using the following command:

$ oc -n openshift-ingress-operator patch ingresscontroller/default --patch '{"spec":{"routeAdmission":{"namespaceOwnership":"InterNamespaceAllowed"}}}' --type=merge

Sample Ingress Controller configuration

spec:
  routeAdmission:
    namespaceOwnership: InterNamespaceAllowed
  ...

Tip: You can alternatively apply the following YAML to configure the route admission policy:
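A minimal manifest for this is:

apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: default
  namespace: openshift-ingress-operator
spec:
  routeAdmission:
    namespaceOwnership: InterNamespaceAllowed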
7.8.12. Using wildcard routes Copy linkLink copied to clipboard!
The HAProxy Ingress Controller has support for wildcard routes. The Ingress Operator uses wildcardPolicy to configure the ROUTER_ALLOW_WILDCARD_ROUTES environment variable of the Ingress Controller.
The default behavior of the Ingress Controller is to admit routes with a wildcard policy of None, which is backwards compatible with existing IngressController resources.
Procedure
Configure the wildcard policy.
Use the following command to edit the IngressController resource:

$ oc edit IngressController

Under spec, set the wildcardPolicy field to WildcardsDisallowed or WildcardsAllowed:

spec:
  routeAdmission:
    wildcardPolicy: WildcardsDisallowed # or WildcardsAllowed
7.8.13. Using X-Forwarded headers Copy linkLink copied to clipboard!
You configure the HAProxy Ingress Controller to specify a policy for how to handle HTTP headers including Forwarded and X-Forwarded-For. The Ingress Operator uses the HTTPHeaders field to configure the ROUTER_SET_FORWARDED_HEADERS environment variable of the Ingress Controller.
Procedure
Configure the HTTPHeaders field for the Ingress Controller.

Use the following command to edit the IngressController resource:

$ oc edit IngressController

Under spec, set the HTTPHeaders policy field to Append, Replace, IfNone, or Never:
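For example, to append forwarding headers (a minimal sketch):

apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: default
  namespace: openshift-ingress-operator
spec:
  httpHeaders:
    forwardedHeaderPolicy: Append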
Example use cases
As a cluster administrator, you can:
Configure an external proxy that injects the X-Forwarded-For header into each request before forwarding it to an Ingress Controller.

To configure the Ingress Controller to pass the header through unmodified, you specify the never policy. The Ingress Controller then never sets the headers, and applications receive only the headers that the external proxy provides.

Configure the Ingress Controller to pass the X-Forwarded-For header that your external proxy sets on external cluster requests through unmodified.

To configure the Ingress Controller to set the X-Forwarded-For header on internal cluster requests, which do not go through the external proxy, specify the if-none policy. If an HTTP request already has the header set through the external proxy, then the Ingress Controller preserves it. If the header is absent because the request did not come through the proxy, then the Ingress Controller adds the header.
As an application developer, you can:
Configure an application-specific external proxy that injects the X-Forwarded-For header.

To configure an Ingress Controller to pass the header through unmodified for an application's Route, without affecting the policy for other Routes, add the annotation haproxy.router.openshift.io/set-forwarded-headers: if-none or haproxy.router.openshift.io/set-forwarded-headers: never on the Route for the application.

Note: You can set the haproxy.router.openshift.io/set-forwarded-headers annotation on a per-route basis, independent from the globally set value for the Ingress Controller.
7.8.14. Enabling HTTP/2 Ingress connectivity Copy linkLink copied to clipboard!
You can enable transparent end-to-end HTTP/2 connectivity in HAProxy. It allows application owners to make use of HTTP/2 protocol capabilities, including single connection, header compression, binary streams, and more.
You can enable HTTP/2 connectivity for an individual Ingress Controller or for the entire cluster.
To enable the use of HTTP/2 for the connection from the client to HAProxy, a route must specify a custom certificate. A route that uses the default certificate cannot use HTTP/2. This restriction is necessary to avoid problems from connection coalescing, where the client re-uses a connection for different routes that use the same certificate.
The connection from HAProxy to the application pod can use HTTP/2 only for re-encrypt routes and not for edge-terminated or insecure routes. This restriction is because HAProxy uses Application-Level Protocol Negotiation (ALPN), which is a TLS extension, to negotiate the use of HTTP/2 with the back-end. The implication is that end-to-end HTTP/2 is possible with passthrough and re-encrypt and not with insecure or edge-terminated routes.
Using WebSockets with a re-encrypt route and with HTTP/2 enabled on an Ingress Controller requires WebSocket support over HTTP/2. WebSockets over HTTP/2 is a feature of HAProxy 2.4, which is unsupported in OpenShift Container Platform at this time.
For non-passthrough routes, the Ingress Controller negotiates its connection to the application independently of the connection from the client. This means a client may connect to the Ingress Controller and negotiate HTTP/1.1, and the Ingress Controller may then connect to the application, negotiate HTTP/2, and forward the request from the client HTTP/1.1 connection using the HTTP/2 connection to the application. This poses a problem if the client subsequently tries to upgrade its connection from HTTP/1.1 to the WebSocket protocol, because the Ingress Controller cannot forward WebSocket to HTTP/2 and cannot upgrade its HTTP/2 connection to WebSocket. Consequently, if you have an application that is intended to accept WebSocket connections, it must not allow negotiating the HTTP/2 protocol or else clients will fail to upgrade to the WebSocket protocol.
Procedure
Enable HTTP/2 on a single Ingress Controller.
To enable HTTP/2 on an Ingress Controller, enter the oc annotate command:

$ oc -n openshift-ingress-operator annotate ingresscontrollers/<ingresscontroller_name> ingress.operator.openshift.io/default-enable-http2=true

Replace <ingresscontroller_name> with the name of the Ingress Controller to annotate.
Enable HTTP/2 on the entire cluster.
To enable HTTP/2 for the entire cluster, enter the oc annotate command:

$ oc annotate ingresses.config/cluster ingress.operator.openshift.io/default-enable-http2=true

Tip: You can alternatively apply the following YAML to add the annotation:
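A minimal manifest for the cluster-wide annotation:

apiVersion: config.openshift.io/v1
kind: Ingress
metadata:
  name: cluster
  annotations:
    ingress.operator.openshift.io/default-enable-http2: "true"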
7.8.15. Configuring the PROXY protocol for an Ingress Controller Copy linkLink copied to clipboard!
A cluster administrator can configure the PROXY protocol when an Ingress Controller uses either the HostNetwork or NodePortService endpoint publishing strategy types. The PROXY protocol enables the load balancer to preserve the original client addresses for connections that the Ingress Controller receives. The original client addresses are useful for logging, filtering, and injecting HTTP headers. In the default configuration, the connections that the Ingress Controller receives only contain the source address that is associated with the load balancer.
This feature is not supported in cloud deployments. This restriction is because when OpenShift Container Platform runs in a cloud platform, and an IngressController specifies that a service load balancer should be used, the Ingress Operator configures the load balancer service and enables the PROXY protocol based on the platform requirement for preserving source addresses.
You must configure both OpenShift Container Platform and the external load balancer to either use the PROXY protocol or to use TCP.
The PROXY protocol is unsupported for the default Ingress Controller with installer-provisioned clusters on non-cloud platforms that use a Keepalived Ingress VIP.
Prerequisites
- You created an Ingress Controller.
Procedure
Edit the Ingress Controller resource:
oc -n openshift-ingress-operator edit ingresscontroller/default
$ oc -n openshift-ingress-operator edit ingresscontroller/defaultCopy to Clipboard Copied! Toggle word wrap Toggle overflow Set the PROXY configuration:
If your Ingress Controller uses the HostNetwork endpoint publishing strategy type, set the spec.endpointPublishingStrategy.hostNetwork.protocol subfield to PROXY:

Sample hostNetwork configuration to PROXY

spec:
  endpointPublishingStrategy:
    hostNetwork:
      protocol: PROXY
    type: HostNetwork

If your Ingress Controller uses the NodePortService endpoint publishing strategy type, set the spec.endpointPublishingStrategy.nodePort.protocol subfield to PROXY:

Sample nodePort configuration to PROXY

spec:
  endpointPublishingStrategy:
    nodePort:
      protocol: PROXY
    type: NodePortService
7.8.16. Specifying an alternative cluster domain using the appsDomain option Copy linkLink copied to clipboard!
As a cluster administrator, you can specify an alternative to the default cluster domain for user-created routes by configuring the appsDomain field. The appsDomain field is an optional domain for OpenShift Container Platform to use instead of the default, which is specified in the domain field. If you specify an alternative domain, it overrides the default cluster domain for the purpose of determining the default host for a new route.
For example, you can use the DNS domain for your company as the default domain for routes and ingresses for applications running on your cluster.
Prerequisites
- You deployed an OpenShift Container Platform cluster.
- You installed the oc command-line interface.
Procedure
Configure the appsDomain field by specifying an alternative default domain for user-created routes.

Edit the ingress cluster resource:

$ oc edit ingresses.config/cluster -o yaml

Edit the YAML file:

Sample appsDomain configuration to test.example.com
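A sketch of the edited resource, using test.example.com as the alternative domain:

apiVersion: config.openshift.io/v1
kind: Ingress
metadata:
  name: cluster
spec:
  domain: apps.example.com        # default cluster domain (example value)
  appsDomain: test.example.com    # alternative default domain for new routes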
Verify that an existing route contains the domain name specified in the appsDomain field by exposing the route and verifying the route domain change:

Note: Wait for the openshift-apiserver to finish rolling updates before exposing the route.

Expose the route:

$ oc expose service hello-openshift
route.route.openshift.io/hello-openshift exposed

Example output:

$ oc get routes
NAME              HOST/PORT                                         PATH   SERVICES          PORT       TERMINATION   WILDCARD
hello-openshift   hello_openshift-<my_project>.test.example.com            hello-openshift   8080-tcp                 None
7.8.17. Converting HTTP header case Copy linkLink copied to clipboard!
HAProxy lowercases HTTP header names by default; for example, changing Host: xyz.com to host: xyz.com. If legacy applications are sensitive to the capitalization of HTTP header names, use the Ingress Controller spec.httpHeaders.headerNameCaseAdjustments API field for a solution to accommodate legacy applications until they can be fixed.
OpenShift Container Platform includes HAProxy 2.2. If you want to update to this version of the web-based load balancer, ensure that you add the spec.httpHeaders.headerNameCaseAdjustments section to your cluster’s configuration file.
As a cluster administrator, you can convert the HTTP header case by entering the oc patch command or by setting the HeaderNameCaseAdjustments field in the Ingress Controller YAML file.
Prerequisites
- You have installed the OpenShift CLI (oc).
- You have access to the cluster as a user with the cluster-admin role.
Procedure
Capitalize an HTTP header by using the oc patch command.

Change the HTTP header from host to Host by running the following command:

$ oc -n openshift-ingress-operator patch ingresscontrollers/default --type=merge --patch='{"spec":{"httpHeaders":{"headerNameCaseAdjustments":["Host"]}}}'

Create a Route resource YAML file so that the annotation can be applied to the application.

Example of a route named my-application
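A sketch of such a route; the name and namespace are illustrative:

apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: my-application
  namespace: my-project
  annotations:
    haproxy.router.openshift.io/h1-adjust-case: "true"   # 1
spec:
  to:
    kind: Service
    name: my-application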
1. Set haproxy.router.openshift.io/h1-adjust-case so that the Ingress Controller can adjust the host request header as specified.
Specify adjustments by configuring the HeaderNameCaseAdjustments field in the Ingress Controller YAML configuration file.

The following example Ingress Controller YAML file adjusts the host header to Host for HTTP/1 requests to appropriately annotated routes:

Example Ingress Controller YAML
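A sketch of the relevant IngressController configuration:

apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: default
  namespace: openshift-ingress-operator
spec:
  httpHeaders:
    headerNameCaseAdjustments:
    - Host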
The following example route enables HTTP response header name case adjustments by using the haproxy.router.openshift.io/h1-adjust-case annotation:

Example route YAML
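A sketch of such a route, reusing the illustrative names from the previous example:

apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: my-application
  namespace: my-project
  annotations:
    haproxy.router.openshift.io/h1-adjust-case: "true"   # 1
spec:
  to:
    kind: Service
    name: my-application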
1. Set haproxy.router.openshift.io/h1-adjust-case to true.
7.8.18. Using router compression Copy linkLink copied to clipboard!
You configure the HAProxy Ingress Controller to specify router compression globally for specific MIME types. You can use the mimeTypes variable to define the formats of MIME types to which compression is applied. The types are: application, image, message, multipart, text, video, or a custom type prefaced by "X-". To see the full notation for MIME types and subtypes, see RFC1341.
Memory allocated for compression can affect the max connections. Additionally, compression of large buffers can cause latency, like heavy regex or long lists of regex.
Not all MIME types benefit from compression, but HAProxy still uses resources to try to compress if instructed to. Generally, text formats, such as html, css, and js, benefit from compression, but formats that are already compressed, such as image, audio, and video, benefit little in exchange for the time and resources spent on compression.
Procedure
Configure the httpCompression field for the Ingress Controller.

Use the following command to edit the IngressController resource:

$ oc edit -n openshift-ingress-operator ingresscontrollers/default

Under spec, set the httpCompression policy field to mimeTypes and specify a list of MIME types that should have compression applied:
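For example, to compress common text and JSON payloads (the MIME type list is illustrative):

apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  name: default
  namespace: openshift-ingress-operator
spec:
  httpCompression:
    mimeTypes:
    - "text/html"
    - "text/css; charset=utf-8"
    - "application/json"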
7.8.19. Exposing router metrics Copy linkLink copied to clipboard!
You can expose the HAProxy router metrics by default in Prometheus format on the default stats port, 1936. The external metrics collection and aggregation systems such as Prometheus can access the HAProxy router metrics. You can view the HAProxy router metrics in a browser in the HTML and comma separated values (CSV) format.
Prerequisites
- You configured your firewall to access the default stats port, 1936.
Procedure
Get the router pod name by running the following command:
oc get pods -n openshift-ingress
$ oc get pods -n openshift-ingressCopy to Clipboard Copied! Toggle word wrap Toggle overflow Example output
NAME READY STATUS RESTARTS AGE router-default-76bfffb66c-46qwp 1/1 Running 0 11h
NAME READY STATUS RESTARTS AGE router-default-76bfffb66c-46qwp 1/1 Running 0 11hCopy to Clipboard Copied! Toggle word wrap Toggle overflow Get the router’s username and password, which the router pod stores in the
/var/lib/haproxy/conf/metrics-auth/statsUsernameand/var/lib/haproxy/conf/metrics-auth/statsPasswordfiles:Get the username by running the following command:
oc rsh <router_pod_name> cat metrics-auth/statsUsername
$ oc rsh <router_pod_name> cat metrics-auth/statsUsernameCopy to Clipboard Copied! Toggle word wrap Toggle overflow Get the password by running the following command:
oc rsh <router_pod_name> cat metrics-auth/statsPassword
$ oc rsh <router_pod_name> cat metrics-auth/statsPasswordCopy to Clipboard Copied! Toggle word wrap Toggle overflow
Get the router IP and metrics certificates by running the following command:
oc describe pod <router_pod>
$ oc describe pod <router_pod>Copy to Clipboard Copied! Toggle word wrap Toggle overflow Get the raw statistics in Prometheus format by running the following command:
curl -u <user>:<password> http://<router_IP>:<stats_port>/metrics
$ curl -u <user>:<password> http://<router_IP>:<stats_port>/metricsCopy to Clipboard Copied! Toggle word wrap Toggle overflow Access the metrics securely by running the following command:
curl -u user:password https://<router_IP>:<stats_port>/metrics -k
$ curl -u user:password https://<router_IP>:<stats_port>/metrics -kCopy to Clipboard Copied! Toggle word wrap Toggle overflow Access the default stats port, 1936, by running the following command:
curl -u <user>:<password> http://<router_IP>:<stats_port>/metrics
$ curl -u <user>:<password> http://<router_IP>:<stats_port>/metricsCopy to Clipboard Copied! Toggle word wrap Toggle overflow Example 7.1. Example output
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Launch the stats window by entering the following URL in a browser:
http://<user>:<password>@<router_IP>:<stats_port>
http://<user>:<password>@<router_IP>:<stats_port>Copy to Clipboard Copied! Toggle word wrap Toggle overflow Optional: Get the stats in CSV format by entering the following URL in a browser:
http://<user>:<password>@<router_ip>:1936/metrics;csv
http://<user>:<password>@<router_ip>:1936/metrics;csvCopy to Clipboard Copied! Toggle word wrap Toggle overflow
7.8.20. Customizing HAProxy error code response pages Copy linkLink copied to clipboard!
As a cluster administrator, you can specify a custom error code response page for either 503, 404, or both error pages. The HAProxy router serves a 503 error page when the application pod is not running or a 404 error page when the requested URL does not exist. For example, if you customize the 503 error code response page, then the page is served when the application pod is not running, and the default 404 error code HTTP response page is served by the HAProxy router for an incorrect route or a non-existing route.
Custom error code response pages are specified in a config map then patched to the Ingress Controller. The config map keys have two available file names as follows: error-page-503.http and error-page-404.http.
Custom HTTP error code response pages must follow the HAProxy HTTP error page configuration guidelines. Here is an example of the default OpenShift Container Platform HAProxy router http 503 error code response page. You can use the default content as a template for creating your own custom page.
By default, the HAProxy router serves only a 503 error page when the application is not running or when the route is incorrect or non-existent. This default behavior is the same as the behavior on OpenShift Container Platform 4.8 and earlier. If a config map for the customization of an HTTP error code response is not provided, and you are using a custom HTTP error code response page, the router serves a default 404 or 503 error code response page.
If you use the OpenShift Container Platform default 503 error code page as a template for your customizations, the headers in the file require an editor that can use CRLF line endings.
Procedure
Create a config map named my-custom-error-code-pages in the openshift-config namespace:

$ oc -n openshift-config create configmap my-custom-error-code-pages \
  --from-file=error-page-503.http \
  --from-file=error-page-404.http

Important: If you do not specify the correct format for the custom error code response page, a router pod outage occurs. To resolve this outage, you must delete or correct the config map and delete the affected router pods so they can be recreated with the correct information.
Patch the Ingress Controller to reference the my-custom-error-code-pages config map by name:

$ oc patch -n openshift-ingress-operator ingresscontroller/default --patch '{"spec":{"httpErrorCodePages":{"name":"my-custom-error-code-pages"}}}' --type=merge

The Ingress Operator copies the my-custom-error-code-pages config map from the openshift-config namespace to the openshift-ingress namespace. The Operator names the config map according to the pattern <your_ingresscontroller_name>-errorpages in the openshift-ingress namespace.

Display the copy:
$ oc get cm default-errorpages -n openshift-ingress

Example output

NAME                 DATA   AGE
default-errorpages   2      25s

The example config map name is default-errorpages because the default Ingress Controller custom resource (CR) was patched.
Confirm that the config map containing the custom error response page mounts on the router volume where the config map key is the filename that has the custom HTTP error code response:
For the 503 custom HTTP error code response:

$ oc -n openshift-ingress rsh <router_pod> cat /var/lib/haproxy/conf/error_code_pages/error-page-503.http

For the 404 custom HTTP error code response:

$ oc -n openshift-ingress rsh <router_pod> cat /var/lib/haproxy/conf/error_code_pages/error-page-404.http
Verification
Verify your custom error code HTTP response:
Create a test project and application:
oc new-project test-ingress
$ oc new-project test-ingressCopy to Clipboard Copied! Toggle word wrap Toggle overflow oc new-app django-psql-example
$ oc new-app django-psql-exampleCopy to Clipboard Copied! Toggle word wrap Toggle overflow For 503 custom http error code response:
- Stop all the pods for the application.
Run the following curl command or visit the route hostname in the browser:
curl -vk <route_hostname>
$ curl -vk <route_hostname>Copy to Clipboard Copied! Toggle word wrap Toggle overflow
For 404 custom http error code response:
- Visit a non-existent route or an incorrect route.
Run the following curl command or visit the route hostname in the browser:
curl -vk <route_hostname>
$ curl -vk <route_hostname>Copy to Clipboard Copied! Toggle word wrap Toggle overflow
Check if the errorfile attribute is properly set in the haproxy.config file:

$ oc -n openshift-ingress rsh <router> cat /var/lib/haproxy/conf/haproxy.config | grep errorfile
7.8.21. Setting the Ingress Controller maximum connections Copy linkLink copied to clipboard!
A cluster administrator can set the maximum number of simultaneous connections for OpenShift router deployments. You can patch an existing Ingress Controller to increase the maximum number of connections.
Prerequisites
- The following assumes that you already created an Ingress Controller
Procedure
Update the Ingress Controller to change the maximum number of connections for HAProxy:
$ oc -n openshift-ingress-operator patch ingresscontroller/default --type=merge -p '{"spec":{"tuningOptions": {"maxConnections": 7500}}}'

Warning: If you set the spec.tuningOptions.maxConnections value greater than the current operating system limit, the HAProxy process will not start. See the table in the "Ingress Controller configuration parameters" section for more information about this parameter.
Chapter 8. Ingress Node Firewall Operator in OpenShift Container Platform Copy linkLink copied to clipboard!
The Ingress Node Firewall Operator provides a stateless, eBPF-based firewall for managing node-level ingress traffic in OpenShift Container Platform.
8.1. Ingress Node Firewall Operator Copy linkLink copied to clipboard!
The Ingress Node Firewall Operator provides ingress firewall rules at a node level by deploying a daemon set to the nodes that you specify and manage in the firewall configurations. To deploy the daemon set, you create an IngressNodeFirewallConfig custom resource (CR). The Operator applies the IngressNodeFirewallConfig CR to create an ingress node firewall daemon set, which runs on all nodes that match the nodeSelector.
You configure the rules of the IngressNodeFirewall CR and apply them to clusters by using the nodeSelector and setting values to "true".
The Ingress Node Firewall Operator supports only stateless firewall rules.
The maximum transmission units (MTU) parameter is 4Kb (kilobytes) in OpenShift Container Platform 4.12.
Network interface controllers (NICs) that do not support native XDP drivers will run at a lower performance.
Ingress Node Firewall Operator is not supported on Amazon Web Services (AWS) with the default OpenShift installation or on Red Hat OpenShift Service on AWS (ROSA). For more information on Red Hat OpenShift Service on AWS support and ingress, see Ingress Operator in Red Hat OpenShift Service on AWS.
8.2. Installing the Ingress Node Firewall Operator Copy linkLink copied to clipboard!
As a cluster administrator, you can install the Ingress Node Firewall Operator by using the OpenShift Container Platform CLI or the web console.
8.2.1. Installing the Ingress Node Firewall Operator using the CLI Copy linkLink copied to clipboard!
As a cluster administrator, you can install the Operator using the CLI.
Prerequisites
- You have installed the OpenShift CLI (oc).
- You have an account with administrator privileges.
Procedure
To create the openshift-ingress-node-firewall namespace, enter the following command:
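A sketch of the namespace manifest; the pod security labels shown are commonly applied for this Operator but should be checked against your cluster policy:

$ cat << EOF | oc create -f -
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-ingress-node-firewall
  labels:
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/enforce-version: v1.24
EOF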
To create an OperatorGroup CR, enter the following command:
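A minimal OperatorGroup sketch; the name is illustrative:

$ cat << EOF | oc create -f -
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: ingress-node-firewall-operators
  namespace: openshift-ingress-node-firewall
EOF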
Subscribe to the Ingress Node Firewall Operator.

To create a Subscription CR for the Ingress Node Firewall Operator, enter the following command:
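A Subscription sketch; adjust the channel and catalog source to match what is available in your cluster:

$ cat << EOF | oc create -f -
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: ingress-node-firewall-sub
  namespace: openshift-ingress-node-firewall
spec:
  name: ingress-node-firewall
  channel: stable
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF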
To verify that the Operator is installed, enter the following command:
oc get ip -n openshift-ingress-node-firewall
$ oc get ip -n openshift-ingress-node-firewallCopy to Clipboard Copied! Toggle word wrap Toggle overflow Example output
NAME CSV APPROVAL APPROVED install-5cvnz ingress-node-firewall.4.12.0-202211122336 Automatic true
NAME CSV APPROVAL APPROVED install-5cvnz ingress-node-firewall.4.12.0-202211122336 Automatic trueCopy to Clipboard Copied! Toggle word wrap Toggle overflow To verify the version of the Operator, enter the following command:
oc get csv -n openshift-ingress-node-firewall
$ oc get csv -n openshift-ingress-node-firewallCopy to Clipboard Copied! Toggle word wrap Toggle overflow Example output
NAME DISPLAY VERSION REPLACES PHASE ingress-node-firewall.4.12.0-202211122336 Ingress Node Firewall Operator 4.12.0-202211122336 ingress-node-firewall.4.12.0-202211102047 Succeeded
NAME DISPLAY VERSION REPLACES PHASE ingress-node-firewall.4.12.0-202211122336 Ingress Node Firewall Operator 4.12.0-202211122336 ingress-node-firewall.4.12.0-202211102047 SucceededCopy to Clipboard Copied! Toggle word wrap Toggle overflow
8.2.2. Installing the Ingress Node Firewall Operator using the web console Copy linkLink copied to clipboard!
As a cluster administrator, you can install the Operator using the web console.
Prerequisites
- You have installed the OpenShift CLI (oc).
- You have an account with administrator privileges.
Procedure
Install the Ingress Node Firewall Operator:
- In the OpenShift Container Platform web console, click Operators → OperatorHub.
- Select Ingress Node Firewall Operator from the list of available Operators, and then click Install.
- On the Install Operator page, under Installed Namespace, select Operator recommended Namespace.
- Click Install.
Verify that the Ingress Node Firewall Operator is installed successfully:
- Navigate to the Operators → Installed Operators page.
Ensure that Ingress Node Firewall Operator is listed in the openshift-ingress-node-firewall project with a Status of InstallSucceeded.
NoteDuring installation an Operator might display a Failed status. If the installation later succeeds with an InstallSucceeded message, you can ignore the Failed message.
If the Operator does not have a Status of InstallSucceeded, troubleshoot using the following steps:
- Inspect the Operator Subscriptions and Install Plans tabs for any failures or errors under Status.
- Navigate to the Workloads → Pods page and check the logs for pods in the openshift-ingress-node-firewall project.
- Check the namespace of the YAML file. If the annotation is missing, you can add the annotation workload.openshift.io/allowed=management to the Operator namespace with the following command:

  $ oc annotate ns/openshift-ingress-node-firewall workload.openshift.io/allowed=management

  Note: For single-node OpenShift clusters, the openshift-ingress-node-firewall namespace requires the workload.openshift.io/allowed=management annotation.
8.3. Deploying Ingress Node Firewall Operator Copy linkLink copied to clipboard!
Prerequisite
- The Ingress Node Firewall Operator is installed.
Procedure
To deploy the Ingress Node Firewall Operator, create an IngressNodeFirewallConfig custom resource that will deploy the Operator's daemon set. You can deploy one or multiple IngressNodeFirewall CRDs to nodes by applying firewall rules.
- Create the IngressNodeFirewallConfig inside the openshift-ingress-node-firewall namespace, named ingressnodefirewallconfig.
- Run the following command to deploy Ingress Node Firewall Operator rules:
oc apply -f rule.yaml
$ oc apply -f rule.yamlCopy to Clipboard Copied! Toggle word wrap Toggle overflow
8.3.1. Ingress Node Firewall configuration object Copy linkLink copied to clipboard!
The fields for the Ingress Node Firewall configuration object are described in the following table:
| Field | Type | Description |
|---|---|---|
| name | string | The name of the CR object. The name of the firewall rules object must be ingressnodefirewallconfig. |
| namespace | string | Namespace for the Ingress Firewall Operator CR object. The IngressNodeFirewallConfig CR must be created inside the openshift-ingress-node-firewall namespace. |
| nodeSelector | | A node selection constraint used to target nodes through specified node labels. For example: spec: nodeSelector: node-role.kubernetes.io/worker: "" Note: One label used in nodeSelector must match a label on the nodes in order for the daemon set to start. |
The Operator consumes the CR and creates an ingress node firewall daemon set on all the nodes that match the nodeSelector.
Ingress Node Firewall Operator example configuration
A complete Ingress Node Firewall Configuration is specified in the following example:
Example Ingress Node Firewall Configuration object
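The following is a minimal sketch of an IngressNodeFirewallConfig object, following the field layout described in the preceding table; verify the exact schema against the ingressnodefirewallconfigs.ingressnodefirewall.openshift.io CRD on your cluster, for example with oc explain ingressnodefirewallconfig.spec:

apiVersion: ingressnodefirewall.openshift.io/v1alpha1
kind: IngressNodeFirewallConfig
metadata:
  name: ingressnodefirewallconfig
  namespace: openshift-ingress-node-firewall
spec:
  nodeSelector:
    node-role.kubernetes.io/worker: ""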
8.3.2. Ingress Node Firewall rules object Copy linkLink copied to clipboard!
The fields for the Ingress Node Firewall rules object are described in the following table:
| Field | Type | Description |
|---|---|---|
| metadata.name | string | The name of the CR object. |
| interfaces | array | The fields for this object specify the interfaces to apply the firewall rules to. For example, - en0. |
| nodeSelector | array | You can use nodeSelector to select the nodes to apply the firewall rules to. |
| ingress | object | The ingress firewall rule objects to apply to the selected interfaces. The values for this object are described in "Ingress object configuration". |
8.3.2.1. Ingress object configuration Copy linkLink copied to clipboard!
The values for the ingress object are defined in the following table:
| Field | Type | Description |
|---|---|---|
| sourceCIDRs | array | Allows you to set the CIDR block. You can configure multiple CIDRs from different address families. Note: Different CIDRs allow you to use the same order rule. In the case that there are multiple CIDRs for the same order but different address families, the rule applies to each CIDR. |
| rules | array | Ingress firewall rules. The order field defines the sequence in which the rules are applied for each sourceCIDRs entry, starting with 1. Set action to allow or deny the matched traffic. Note: Ingress firewall rules are verified using a verification webhook that blocks any invalid configuration. The verification webhook prevents you from blocking any critical cluster services such as the API server or SSH. |
8.3.2.2. Ingress Node Firewall rules object example Copy linkLink copied to clipboard!
A complete Ingress Node Firewall configuration is specified in the following example:
Example Ingress Node Firewall configuration
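A minimal sketch of an IngressNodeFirewall rules object, following the fields described in the preceding tables; the interface name, CIDR, and ports are placeholder values, and the marker 1 corresponds to the callout below:

apiVersion: ingressnodefirewall.openshift.io/v1alpha1
kind: IngressNodeFirewall
metadata:
  name: ingressnodefirewall
spec:
  interfaces:
  - eth0
  nodeSelector:
    matchLabels:
      <label_name>: <label_value>  # 1
  ingress:
  - sourceCIDRs:
    - 172.16.0.0/12
    rules:
    - order: 10
      protocolConfig:
        protocol: ICMP
        icmp:
          icmpType: 8  # ICMP echo request
      action: deny
    - order: 20
      protocolConfig:
        protocol: TCP
        tcp:
          ports: "8000-9000"
      action: deny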
- 1
- A <label_name> and a <label_value> must exist on the node and must match the nodeSelector label and value applied to the nodes you want the ingressfirewallconfig CR to run on. The <label_value> can be true or false. By using nodeSelector labels, you can target separate groups of nodes to apply different rules to using the ingressfirewallconfig CR.
8.3.2.3. Zero trust Ingress Node Firewall rules object example Copy linkLink copied to clipboard!
Zero trust Ingress Node Firewall rules can provide additional security to multi-interface clusters. For example, you can use zero trust Ingress Node Firewall rules to drop all traffic on a specific interface except for SSH.
A complete configuration of a zero trust Ingress Node Firewall rule set is specified in the following example:
In the following case, you must add every port that your application uses to the allowlist to ensure that your application continues to function properly.
Example zero trust Ingress Node Firewall rules
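A sketch of a zero trust rule set under the same assumed schema: all traffic on a secondary interface is denied except SSH on port 22. The interface name and node label are placeholders:

apiVersion: ingressnodefirewall.openshift.io/v1alpha1
kind: IngressNodeFirewall
metadata:
  name: ingressnodefirewall-zero-trust
spec:
  interfaces:
  - eth1  # secondary, multi-homed interface
  nodeSelector:
    matchLabels:
      <label_name>: <label_value>
  ingress:
  - sourceCIDRs:
    - 0.0.0.0/0  # all traffic
    rules:
    - order: 10
      protocolConfig:
        protocol: TCP
        tcp:
          ports: 22  # allow SSH
      action: allow
    - order: 20
      action: deny  # deny everything else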
8.4. Viewing Ingress Node Firewall Operator rules Copy linkLink copied to clipboard!
Procedure
Run the following command to view all current rules:
oc get ingressnodefirewall
$ oc get ingressnodefirewallCopy to Clipboard Copied! Toggle word wrap Toggle overflow Choose one of the returned
<resource>names and run the following command to view the rules or configs:oc get <resource> <name> -o yaml
$ oc get <resource> <name> -o yamlCopy to Clipboard Copied! Toggle word wrap Toggle overflow
8.5. Troubleshooting the Ingress Node Firewall Operator Copy linkLink copied to clipboard!
Run the following command to list installed Ingress Node Firewall custom resource definitions (CRD):
oc get crds | grep ingressnodefirewall
$ oc get crds | grep ingressnodefirewallCopy to Clipboard Copied! Toggle word wrap Toggle overflow Example output
NAME                                                              CREATED AT
ingressnodefirewallconfigs.ingressnodefirewall.openshift.io      2022-08-25T10:03:01Z
ingressnodefirewallnodestates.ingressnodefirewall.openshift.io   2022-08-25T10:03:00Z
ingressnodefirewalls.ingressnodefirewall.openshift.io            2022-08-25T10:03:00Z

Run the following command to view the state of the Ingress Node Firewall Operator:
oc get pods -n openshift-ingress-node-firewall
$ oc get pods -n openshift-ingress-node-firewallCopy to Clipboard Copied! Toggle word wrap Toggle overflow Example output
NAME                                        READY   STATUS    RESTARTS   AGE
ingress-node-firewall-controller-manager    2/2     Running   0          5d21h
ingress-node-firewall-daemon-pqx56          3/3     Running   0          5d21h

The following fields provide information about the status of the Operator:
READY,STATUS,AGE, andRESTARTS. TheSTATUSfield isRunningwhen the Ingress Node Firewall Operator is deploying a daemon set to the assigned nodes.Run the following command to collect all ingress firewall node pods' logs:
$ oc adm must-gather -- gather_ingress_node_firewall

The logs are available in the sos node's report containing eBPF bpftool outputs at /sos_commands/ebpf. These reports include lookup tables used or updated as the ingress firewall XDP handles packet processing, updates statistics, and emits events.
Chapter 9. Configuring an Ingress Controller for manual DNS Management Copy linkLink copied to clipboard!
As a cluster administrator, when you create an Ingress Controller, the Operator manages the DNS records automatically. This has some limitations when the required DNS zone is different from the cluster DNS zone or when the DNS zone is hosted outside the cloud provider.
As a cluster administrator, you can configure an Ingress Controller to stop automatic DNS management and start manual DNS management. Set dnsManagementPolicy to specify whether the DNS record should be automatically or manually managed.
When you change an Ingress Controller from Managed to Unmanaged DNS management policy, the Operator does not clean up the previous wildcard DNS record provisioned on the cloud. When you change an Ingress Controller from Unmanaged to Managed DNS management policy, the Operator attempts to create the DNS record on the cloud provider if it does not exist or updates the DNS record if it already exists.
When you set dnsManagementPolicy to unmanaged, you have to manually manage the lifecycle of the wildcard DNS record on the cloud provider.
9.1. Managed DNS management policy Copy linkLink copied to clipboard!
The Managed DNS management policy for Ingress Controllers ensures that the lifecycle of the wildcard DNS record on the cloud provider is automatically managed by the Operator.
9.2. Unmanaged DNS management policy Copy linkLink copied to clipboard!
The Unmanaged DNS management policy for Ingress Controllers ensures that the lifecycle of the wildcard DNS record on the cloud provider is not automatically managed; instead, it becomes the responsibility of the cluster administrator.
On the AWS cloud platform, if the domain on the Ingress Controller does not match dnsConfig.Spec.BaseDomain, then the DNS management policy is automatically set to Unmanaged.
9.3. Creating a custom Ingress Controller with the Unmanaged DNS management policy Copy linkLink copied to clipboard!
As a cluster administrator, you can create a new custom Ingress Controller with the Unmanaged DNS management policy.
Prerequisites
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
Procedure
Create a custom resource (CR) file named sample-ingress.yaml containing the following. A sketch of the file appears after the field descriptions below.
- 1
- Specify the
<name>with a name for theIngressControllerobject. - 2
- Specify the
domainbased on the DNS record that was created as a prerequisite. - 3
- Specify the
scopeasExternalto expose the load balancer externally. - 4
dnsManagementPolicyindicates if the Ingress Controller is managing the lifecycle of the wildcard DNS record associated with the load balancer. The valid values areManagedandUnmanaged. The default value isManaged.
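The following is a minimal sketch of the sample-ingress.yaml file described by the callouts above; the callout numbers are shown as comments, and the placeholder values must be adapted for your environment:

apiVersion: operator.openshift.io/v1
kind: IngressController
metadata:
  namespace: openshift-ingress-operator
  name: <name>  # 1
spec:
  domain: <domain>  # 2
  endpointPublishingStrategy:
    type: LoadBalancerService
    loadBalancer:
      scope: External  # 3
      dnsManagementPolicy: Unmanaged  # 4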
Save the file to apply the changes.
$ oc apply -f sample-ingress.yaml
9.4. Modifying an existing Ingress Controller Copy linkLink copied to clipboard!
As a cluster administrator, you can modify an existing Ingress Controller to manually manage the DNS record lifecycle.
Prerequisites
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
Procedure
Modify the chosen IngressController to set dnsManagementPolicy:

$ SCOPE=$(oc -n openshift-ingress-operator get ingresscontroller <name> -o=jsonpath="{.status.endpointPublishingStrategy.loadBalancer.scope}")
$ oc -n openshift-ingress-operator patch ingresscontrollers/<name> --type=merge --patch='{"spec":{"endpointPublishingStrategy":{"type":"LoadBalancerService","loadBalancer":{"dnsManagementPolicy":"Unmanaged", "scope":"'"${SCOPE}"'"}}}}'

- Optional: You can delete the associated DNS record in the cloud provider.
Chapter 10. Verifying connectivity to an endpoint Copy linkLink copied to clipboard!
The Cluster Network Operator (CNO) runs a controller, the connectivity check controller, that performs a connection health check between resources within your cluster. By reviewing the results of the health checks, you can diagnose connection problems or eliminate network connectivity as the cause of an issue that you are investigating.
10.1. Connection health checks that are performed Copy linkLink copied to clipboard!
To verify that cluster resources are reachable, a TCP connection is made to each of the following cluster API services:
- Kubernetes API server service
- Kubernetes API server endpoints
- OpenShift API server service
- OpenShift API server endpoints
- Load balancers
To verify that services and service endpoints are reachable on every node in the cluster, a TCP connection is made to each of the following targets:
- Health check target service
- Health check target endpoints
10.2. Implementation of connection health checks Copy linkLink copied to clipboard!
The connectivity check controller orchestrates connection verification checks in your cluster. The results for the connection tests are stored in PodNetworkConnectivityCheck objects in the openshift-network-diagnostics namespace. Connection tests are performed every minute in parallel.
The Cluster Network Operator (CNO) deploys several resources to the cluster to send and receive connectivity health checks:
- Health check source
- This program deploys in a single pod replica set managed by a Deployment object. The program consumes PodNetworkConnectivityCheck objects and connects to the spec.targetEndpoint specified in each object.
- Health check target
- A pod deployed as part of a daemon set on every node in the cluster. The pod listens for inbound health checks. The presence of this pod on every node allows for the testing of connectivity to each node.
You can configure the nodes that the network connectivity source and target pods run on by using a node selector. Additionally, you can specify permissible tolerations for source and target pods. The configuration is defined in the singleton cluster custom resource of the Network API in the config.openshift.io/v1 API group.
Pod scheduling occurs after you have updated the configuration. Therefore, you must apply node labels that you intend to use in your selectors before updating the configuration. Labels applied after updating your network connectivity check pod placement are ignored.
Refer to the default configuration in the following YAML:
Default configuration for connectivity source and target pods
- 1 1
- Specifies the network diagnostics configuration. If a value is not specified or an empty object is specified, and
spec.disableNetworkDiagnostics=trueis set in thenetwork.operator.openshift.iocustom resource namedcluster, network diagnostics are disabled. If set, this value overridesspec.disableNetworkDiagnostics=true. - 2
- Specifies the diagnostics mode. The value can be the empty string,
All, orDisabled. The empty string is equivalent to specifyingAll. - 3
- Optional: Specifies a selector for connectivity check source pods. You can use the
nodeSelectorandtolerationsfields to further specify thesourceNodepods. These are optional for both source and target pods. You can omit them, use both, or use only one of them. - 4
- Optional: Specifies a selector for connectivity check target pods. You can use the
nodeSelectorandtolerationsfields to further specify thetargetNodepods. These are optional for both source and target pods. You can omit them, use both, or use only one of them.
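A minimal sketch of this configuration, assuming the field layout described by the callouts above (a networkDiagnostics block with mode plus optional source and target placement blocks); the label and taint names are placeholders, and you can confirm the exact field names with oc explain network.spec on your cluster:

apiVersion: config.openshift.io/v1
kind: Network
metadata:
  name: cluster
spec:
  networkDiagnostics:  # 1
    mode: "All"  # 2
    sourcePlacement:  # 3
      nodeSelector:
        checkNodes: groupA
    targetPlacement:  # 4
      nodeSelector:
        checkNodes: groupB
      tolerations:
      - key: myTaint
        effect: NoSchedule
        operator: Exists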
10.3. PodNetworkConnectivityCheck object fields Copy linkLink copied to clipboard!
The PodNetworkConnectivityCheck object fields are described in the following tables.
| Field | Type | Description |
|---|---|---|
| metadata.name | string | The name of the object in the following format: <source>-to-<target>. |
| metadata.namespace | string | The namespace that the object is associated with. This value is always openshift-network-diagnostics. |
| spec.sourcePod | string | The name of the pod where the connection check originates, such as a network-check-source pod. |
| spec.targetEndpoint | string | The target of the connection check, such as an API server host and port. |
| spec.tlsClientCert | object | Configuration for the TLS certificate to use. |
| spec.tlsClientCert.name | string | The name of the TLS certificate used, if any. The default value is an empty string. |
| status | object | An object representing the condition of the connection test and logs of recent connection successes and failures. |
| status.conditions | array | The latest status of the connection check and any previous statuses. |
| status.failures | array | Connection test logs from unsuccessful attempts. |
| status.outages | array | Connection test logs covering the time periods of any outages. |
| status.successes | array | Connection test logs from successful attempts. |
The following table describes the fields for objects in the status.conditions array:
| Field | Type | Description |
|---|---|---|
| lastTransitionTime | string | The time that the condition of the connection transitioned from one status to another. |
| message | string | The details about the last transition in a human readable format. |
| reason | string | The last status of the transition in a machine readable format. |
| status | string | The status of the condition. |
| type | string | The type of the condition. |
The following table describes the fields for objects in the status.outages array:
| Field | Type | Description |
|---|---|---|
| end | string | The timestamp from when the connection failure is resolved. |
| endLogs | array | Connection log entries, including the log entry related to the successful end of the outage. |
| message | string | A summary of outage details in a human readable format. |
| start | string | The timestamp from when the connection failure is first detected. |
| startLogs | array | Connection log entries, including the original failure. |
10.3.1. Connection log fields Copy linkLink copied to clipboard!
The fields for a connection log entry are described in the following table. The object is used in the following fields:
- status.failures[]
- status.successes[]
- status.outages[].startLogs[]
- status.outages[].endLogs[]
| Field | Type | Description |
|---|---|---|
| latency | string | Records the duration of the action. |
| message | string | Provides the status in a human readable format. |
| reason | string | Provides the reason for the status in a machine readable format. The value is one of TCPConnect, TCPConnectError, DNSResolve, DNSError. |
| success | boolean | Indicates if the log entry is a success or failure. |
| time | string | The start time of the connection check. |
10.4. Verifying network connectivity for an endpoint Copy linkLink copied to clipboard!
As a cluster administrator, you can verify the connectivity of an endpoint, such as an API server, load balancer, service, or pod.
Prerequisites
- Install the OpenShift CLI (oc).
- Access to the cluster as a user with the cluster-admin role.
Procedure
To list the current
PodNetworkConnectivityCheckobjects, enter the following command:oc get podnetworkconnectivitycheck -n openshift-network-diagnostics
$ oc get podnetworkconnectivitycheck -n openshift-network-diagnosticsCopy to Clipboard Copied! Toggle word wrap Toggle overflow Example output
Copy to Clipboard Copied! Toggle word wrap Toggle overflow View the connection test logs:
- From the output of the previous command, identify the endpoint that you want to review the connectivity logs for.
View the object by entering the following command:
oc get podnetworkconnectivitycheck <name> \ -n openshift-network-diagnostics -o yaml
$ oc get podnetworkconnectivitycheck <name> \ -n openshift-network-diagnostics -o yamlCopy to Clipboard Copied! Toggle word wrap Toggle overflow where
<name>specifies the name of thePodNetworkConnectivityCheckobject.Example output
Copy to Clipboard Copied! Toggle word wrap Toggle overflow
Chapter 11. Changing the MTU for the cluster network Copy linkLink copied to clipboard!
As a cluster administrator, you can change the MTU for the cluster network after cluster installation. This change is disruptive as cluster nodes must be rebooted to finalize the MTU change. You can change the MTU only for clusters using the OVN-Kubernetes or OpenShift SDN network plugins.
11.1. About the cluster MTU Copy linkLink copied to clipboard!
During installation the maximum transmission unit (MTU) for the cluster network is detected automatically based on the MTU of the primary network interface of nodes in the cluster. You do not normally need to override the detected MTU.
You might want to change the MTU of the cluster network for several reasons:
- The MTU detected during cluster installation is not correct for your infrastructure
- Your cluster infrastructure now requires a different MTU, such as from the addition of nodes that need a different MTU for optimal performance
You can change the cluster MTU for only the OVN-Kubernetes and OpenShift SDN cluster network plugins.
11.1.1. Service interruption considerations Copy linkLink copied to clipboard!
When you initiate an MTU change on your cluster the following effects might impact service availability:
- At least two rolling reboots are required to complete the migration to a new MTU. During this time, some nodes are not available as they restart.
- Specific applications deployed to the cluster with shorter timeout intervals than the absolute TCP timeout interval might experience disruption during the MTU change.
11.1.2. MTU value selection Copy linkLink copied to clipboard!
When planning your MTU migration there are two related but distinct MTU values to consider.
- Hardware MTU: This MTU value is set based on the specifics of your network infrastructure.
Cluster network MTU: This MTU value is always less than your hardware MTU to account for the cluster network overlay overhead. The specific overhead is determined by your network plugin:
-
OVN-Kubernetes:
100bytes -
OpenShift SDN:
50bytes
-
OVN-Kubernetes:
If your cluster requires different MTU values for different nodes, you must subtract the overhead value for your network plugin from the lowest MTU value that is used by any node in your cluster. For example, if some nodes in your cluster have an MTU of 9001 and some have an MTU of 1500, and your cluster uses the OVN-Kubernetes plugin, you must set this value to 1400 (1500 minus the 100 byte overhead).
To avoid selecting an MTU value that is not acceptable by a node, verify the maximum MTU value (maxmtu) that is accepted by the network interface by using the ip -d link command.
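For example, assuming a node named <node_name> and an interface named ens3 (both placeholders), you can check the reported maximum with a command like the following; the maxmtu attribute appears only when the NIC driver reports it:

$ oc debug node/<node_name> -- chroot /host ip -d link show ens3 | grep -o 'maxmtu [0-9]*'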
11.1.3. How the migration process works Copy linkLink copied to clipboard!
The following table summarizes the migration process by segmenting between the user-initiated steps in the process and the actions that the migration performs in response.
| User-initiated steps | OpenShift Container Platform activity |
|---|---|
| Set the following values in the Cluster Network Operator configuration: spec.migration.mtu.machine.to, spec.migration.mtu.network.from, and spec.migration.mtu.network.to. | Cluster Network Operator (CNO): Confirms that each field is set to a valid value. If the values provided are valid, the CNO writes out a new temporary configuration with the MTU for the cluster network set to the value of the mtu.network.to field. Machine Config Operator (MCO): Performs a rolling reboot of each node in the cluster. |
| Reconfigure the MTU of the primary network interface for the nodes on the cluster. You can use a variety of methods to accomplish this, including a new NetworkManager connection profile, a DHCP server setting, or a kernel command line. | N/A |
| Set the mtu value for the network plugin in the CNO configuration and set spec.migration to null. | Machine Config Operator (MCO): Performs a rolling reboot of each node in the cluster with the new MTU configuration. |
11.2. Changing the cluster MTU Copy linkLink copied to clipboard!
As a cluster administrator, you can change the maximum transmission unit (MTU) for your cluster. The migration is disruptive and nodes in your cluster might be temporarily unavailable as the MTU update rolls out.
The following procedure describes how to change the cluster MTU by using either machine configs, DHCP, or an ISO. If you use the DHCP or ISO approach, you must refer to configuration artifacts that you kept after installing your cluster to complete the procedure.
Prerequisites
-
You installed the OpenShift CLI (
oc). -
You are logged in to the cluster with a user with
cluster-adminprivileges. You identified the target MTU for your cluster. The correct MTU varies depending on the network plugin that your cluster uses:
-
OVN-Kubernetes: The cluster MTU must be set to
100less than the lowest hardware MTU value in your cluster. -
OpenShift SDN: The cluster MTU must be set to
50less than the lowest hardware MTU value in your cluster.
-
OVN-Kubernetes: The cluster MTU must be set to
- If your nodes are physical machines, ensure that the cluster network and the connected network switches support jumbo frames.
- If your nodes are virtual machines (VMs), ensure that the hypervisor and the connected network switches support jumbo frames.
Procedure
To increase or decrease the MTU for the cluster network complete the following procedure.
To obtain the current MTU for the cluster network, enter the following command:
oc describe network.config cluster
$ oc describe network.config clusterCopy to Clipboard Copied! Toggle word wrap Toggle overflow Example output
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Prepare your configuration for the hardware MTU:
If your hardware MTU is specified with DHCP, update your DHCP configuration such as with the following dnsmasq configuration:
dhcp-option-force=26,<mtu>
dhcp-option-force=26,<mtu>Copy to Clipboard Copied! Toggle word wrap Toggle overflow where:
<mtu>- Specifies the hardware MTU for the DHCP server to advertise.
- If your hardware MTU is specified with a kernel command line with PXE, update that configuration accordingly.
If your hardware MTU is specified in a NetworkManager connection configuration, complete the following steps. This approach is the default for OpenShift Container Platform if you do not explicitly specify your network configuration with DHCP, a kernel command line, or some other method. Your cluster nodes must all use the same underlying network configuration for the following procedure to work unmodified.
Find the primary network interface:
If you are using the OpenShift SDN network plugin, enter the following command:
oc debug node/<node_name> -- chroot /host ip route list match 0.0.0.0/0 | awk '{print $5 }'$ oc debug node/<node_name> -- chroot /host ip route list match 0.0.0.0/0 | awk '{print $5 }'Copy to Clipboard Copied! Toggle word wrap Toggle overflow where:
<node_name>- Specifies the name of a node in your cluster.
If you are using the OVN-Kubernetes network plugin, enter the following command:
oc debug node/<node_name> -- chroot /host nmcli -g connection.interface-name c show ovs-if-phys0
$ oc debug node/<node_name> -- chroot /host nmcli -g connection.interface-name c show ovs-if-phys0Copy to Clipboard Copied! Toggle word wrap Toggle overflow where:
<node_name>- Specifies the name of a node in your cluster.
Create the following NetworkManager configuration in the
<interface>-mtu.conffile:Example NetworkManager connection configuration
[connection-<interface>-mtu] match-device=interface-name:<interface> ethernet.mtu=<mtu>
[connection-<interface>-mtu] match-device=interface-name:<interface> ethernet.mtu=<mtu>Copy to Clipboard Copied! Toggle word wrap Toggle overflow where:
<mtu>- Specifies the new hardware MTU value.
<interface>- Specifies the primary network interface name.
Create two MachineConfig objects, one for the control plane nodes and another for the worker nodes in your cluster:

- Create the following Butane config in the control-plane-interface.bu file. A minimal sketch of this file appears after the command at the end of this list.

  Note: The Butane version you specify in the config file should match the OpenShift Container Platform version and always end in 0. For example, 4.12.0. See "Creating machine configs with Butane" for information about Butane.

- Create the following Butane config in the worker-interface.bu file. It follows the same sketch, with the metadata name and role label changed for the worker role.

- Create MachineConfig objects from the Butane configs by running the following command:

  $ for manifest in control-plane-interface worker-interface; do
      butane --files-dir . $manifest.bu > $manifest.yaml
    done
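A minimal sketch of control-plane-interface.bu, assuming the NetworkManager configuration file you created earlier is named <interface>-mtu.conf and is located in the directory passed to butane --files-dir; for worker-interface.bu, change the name to 01-worker-interface and the role label to worker:

variant: openshift
version: 4.12.0
metadata:
  name: 01-control-plane-interface
  labels:
    machineconfiguration.openshift.io/role: master
storage:
  files:
    - path: /etc/NetworkManager/conf.d/99-<interface>-mtu.conf
      contents:
        local: <interface>-mtu.conf
      mode: 0600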
To begin the MTU migration, specify the migration configuration by entering the following command. The Machine Config Operator performs a rolling reboot of the nodes in the cluster in preparation for the MTU change.
oc patch Network.operator.openshift.io cluster --type=merge --patch \ '{"spec": { "migration": { "mtu": { "network": { "from": <overlay_from>, "to": <overlay_to> } , "machine": { "to" : <machine_to> } } } } }'$ oc patch Network.operator.openshift.io cluster --type=merge --patch \ '{"spec": { "migration": { "mtu": { "network": { "from": <overlay_from>, "to": <overlay_to> } , "machine": { "to" : <machine_to> } } } } }'Copy to Clipboard Copied! Toggle word wrap Toggle overflow where:
<overlay_from>- Specifies the current cluster network MTU value.
<overlay_to>-
Specifies the target MTU for the cluster network. This value is set relative to the value for
<machine_to>and for OVN-Kubernetes must be100less and for OpenShift SDN must be50less. <machine_to>- Specifies the MTU for the primary network interface on the underlying host network.
Example that increases the cluster MTU
oc patch Network.operator.openshift.io cluster --type=merge --patch \ '{"spec": { "migration": { "mtu": { "network": { "from": 1400, "to": 9000 } , "machine": { "to" : 9100} } } } }'$ oc patch Network.operator.openshift.io cluster --type=merge --patch \ '{"spec": { "migration": { "mtu": { "network": { "from": 1400, "to": 9000 } , "machine": { "to" : 9100} } } } }'Copy to Clipboard Copied! Toggle word wrap Toggle overflow As the MCO updates machines in each machine config pool, it reboots each node one by one. You must wait until all the nodes are updated. Check the machine config pool status by entering the following command:
oc get mcp
$ oc get mcpCopy to Clipboard Copied! Toggle word wrap Toggle overflow A successfully updated node has the following status:
UPDATED=true,UPDATING=false,DEGRADED=false.NoteBy default, the MCO updates one machine per pool at a time, causing the total time the migration takes to increase with the size of the cluster.
Confirm the status of the new machine configuration on the hosts:
To list the machine configuration state and the name of the applied machine configuration, enter the following command:
oc describe node | egrep "hostname|machineconfig"
$ oc describe node | egrep "hostname|machineconfig"Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example output
kubernetes.io/hostname=master-0 machineconfiguration.openshift.io/currentConfig: rendered-master-c53e221d9d24e1c8bb6ee89dd3d8ad7b machineconfiguration.openshift.io/desiredConfig: rendered-master-c53e221d9d24e1c8bb6ee89dd3d8ad7b machineconfiguration.openshift.io/reason: machineconfiguration.openshift.io/state: Done
kubernetes.io/hostname=master-0 machineconfiguration.openshift.io/currentConfig: rendered-master-c53e221d9d24e1c8bb6ee89dd3d8ad7b machineconfiguration.openshift.io/desiredConfig: rendered-master-c53e221d9d24e1c8bb6ee89dd3d8ad7b machineconfiguration.openshift.io/reason: machineconfiguration.openshift.io/state: DoneCopy to Clipboard Copied! Toggle word wrap Toggle overflow Verify that the following statements are true:
-
The value of
machineconfiguration.openshift.io/statefield isDone. -
The value of the
machineconfiguration.openshift.io/currentConfigfield is equal to the value of themachineconfiguration.openshift.io/desiredConfigfield.
-
The value of
To confirm that the machine config is correct, enter the following command:
oc get machineconfig <config_name> -o yaml | grep ExecStart
$ oc get machineconfig <config_name> -o yaml | grep ExecStartCopy to Clipboard Copied! Toggle word wrap Toggle overflow where
<config_name>is the name of the machine config from themachineconfiguration.openshift.io/currentConfigfield.The machine config must include the following update to the systemd configuration:
ExecStart=/usr/local/bin/mtu-migration.sh
ExecStart=/usr/local/bin/mtu-migration.shCopy to Clipboard Copied! Toggle word wrap Toggle overflow
Update the underlying network interface MTU value:
If you are specifying the new MTU with a NetworkManager connection configuration, enter the following command. The Machine Config Operator automatically performs a rolling reboot of the nodes in your cluster.
for manifest in control-plane-interface worker-interface; do oc create -f $manifest.yaml done$ for manifest in control-plane-interface worker-interface; do oc create -f $manifest.yaml doneCopy to Clipboard Copied! Toggle word wrap Toggle overflow - If you are specifying the new MTU with a DHCP server option or a kernel command line and PXE, make the necessary changes for your infrastructure.
As the MCO updates machines in each machine config pool, it reboots each node one by one. You must wait until all the nodes are updated. Check the machine config pool status by entering the following command:
oc get mcp
$ oc get mcpCopy to Clipboard Copied! Toggle word wrap Toggle overflow A successfully updated node has the following status:
UPDATED=true,UPDATING=false,DEGRADED=false.NoteBy default, the MCO updates one machine per pool at a time, causing the total time the migration takes to increase with the size of the cluster.
Confirm the status of the new machine configuration on the hosts:
To list the machine configuration state and the name of the applied machine configuration, enter the following command:
oc describe node | egrep "hostname|machineconfig"
$ oc describe node | egrep "hostname|machineconfig"Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example output
kubernetes.io/hostname=master-0 machineconfiguration.openshift.io/currentConfig: rendered-master-c53e221d9d24e1c8bb6ee89dd3d8ad7b machineconfiguration.openshift.io/desiredConfig: rendered-master-c53e221d9d24e1c8bb6ee89dd3d8ad7b machineconfiguration.openshift.io/reason: machineconfiguration.openshift.io/state: Done
kubernetes.io/hostname=master-0 machineconfiguration.openshift.io/currentConfig: rendered-master-c53e221d9d24e1c8bb6ee89dd3d8ad7b machineconfiguration.openshift.io/desiredConfig: rendered-master-c53e221d9d24e1c8bb6ee89dd3d8ad7b machineconfiguration.openshift.io/reason: machineconfiguration.openshift.io/state: DoneCopy to Clipboard Copied! Toggle word wrap Toggle overflow Verify that the following statements are true:
-
The value of
machineconfiguration.openshift.io/statefield isDone. -
The value of the
machineconfiguration.openshift.io/currentConfigfield is equal to the value of themachineconfiguration.openshift.io/desiredConfigfield.
-
The value of
To confirm that the machine config is correct, enter the following command:
oc get machineconfig <config_name> -o yaml | grep path:
$ oc get machineconfig <config_name> -o yaml | grep path:Copy to Clipboard Copied! Toggle word wrap Toggle overflow where
<config_name>is the name of the machine config from themachineconfiguration.openshift.io/currentConfigfield.If the machine config is successfully deployed, the previous output contains the
/etc/NetworkManager/conf.d/99-<interface>-mtu.conffile path and theExecStart=/usr/local/bin/mtu-migration.shline.
To finalize the MTU migration, enter one of the following commands:
If you are using the OVN-Kubernetes network plugin:
oc patch Network.operator.openshift.io cluster --type=merge --patch \ '{"spec": { "migration": null, "defaultNetwork":{ "ovnKubernetesConfig": { "mtu": <mtu> }}}}'$ oc patch Network.operator.openshift.io cluster --type=merge --patch \ '{"spec": { "migration": null, "defaultNetwork":{ "ovnKubernetesConfig": { "mtu": <mtu> }}}}'Copy to Clipboard Copied! Toggle word wrap Toggle overflow where:
<mtu>-
Specifies the new cluster network MTU that you specified with
<overlay_to>.
If you are using the OpenShift SDN network plugin:
oc patch Network.operator.openshift.io cluster --type=merge --patch \ '{"spec": { "migration": null, "defaultNetwork":{ "openshiftSDNConfig": { "mtu": <mtu> }}}}'$ oc patch Network.operator.openshift.io cluster --type=merge --patch \ '{"spec": { "migration": null, "defaultNetwork":{ "openshiftSDNConfig": { "mtu": <mtu> }}}}'Copy to Clipboard Copied! Toggle word wrap Toggle overflow where:
<mtu>-
Specifies the new cluster network MTU that you specified with
<overlay_to>.
After finalizing the MTU migration, each MCP node is rebooted one by one. You must wait until all the nodes are updated. Check the machine config pool status by entering the following command:
oc get mcp
$ oc get mcpCopy to Clipboard Copied! Toggle word wrap Toggle overflow A successfully updated node has the following status:
UPDATED=true,UPDATING=false,DEGRADED=false.
Verification
You can verify that a node in your cluster uses an MTU that you specified in the previous procedure.
To get the current MTU for the cluster network, enter the following command:
oc describe network.config cluster
$ oc describe network.config clusterCopy to Clipboard Copied! Toggle word wrap Toggle overflow Get the current MTU for the primary network interface of a node.
To list the nodes in your cluster, enter the following command:
oc get nodes
$ oc get nodesCopy to Clipboard Copied! Toggle word wrap Toggle overflow To obtain the current MTU setting for the primary network interface on a node, enter the following command:
oc debug node/<node> -- chroot /host ip address show <interface>
$ oc debug node/<node> -- chroot /host ip address show <interface>Copy to Clipboard Copied! Toggle word wrap Toggle overflow where:
<node>- Specifies a node from the output from the previous step.
<interface>- Specifies the primary network interface name for the node.
Example output
ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8051
ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8051Copy to Clipboard Copied! Toggle word wrap Toggle overflow
Chapter 12. Configuring the node port service range Copy linkLink copied to clipboard!
During cluster installation, you can configure the node port range to meet the requirements of your cluster. After cluster installation, only a cluster administrator can expand the range as a postinstallation task. If your cluster uses a large number of node ports, consider increasing the available port range according to the requirements of your cluster.
If you do not set a node port range during cluster installation, the default range of 30000-32768 applies to your cluster. In this situation, you can expand the range on either side, but you must preserve 30000-32768 within your new port range.
Red Hat has not performed testing outside the default port range of 30000-32768. For ranges outside the default port range, ensure that you test to verify that expanding the node port range does not impact your cluster. In particular, ensure that there is:
- No overlap with any ports already in use by host processes
- No overlap with any ports already in use by pods that are configured with host networking
If you expanded the range and a port allocation issue occurs, create a new cluster and set the required range for it.
If you expand the node port range and OpenShift CLI (oc) stops working because of a port conflict with the OpenShift Container Platform API server, you must create a new cluster.
12.1. Expanding the node port range Copy linkLink copied to clipboard!
You can expand the node port range for your cluster. After you install your OpenShift Container Platform cluster, you cannot shrink the node port range on either side of the currently configured range.
Red Hat has not performed testing outside the default port range of 30000-32768. For ranges outside the default port range, ensure that you test to verify that expanding your node port range does not impact your cluster. If you expanded the range and a port allocation issue occurs, create a new cluster and set the required range for it.
Prerequisites
- Installed the OpenShift CLI (oc).
- Logged in to the cluster as a user with cluster-admin privileges.
- You ensured that your cluster infrastructure allows access to the ports that exist in the extended range. For example, if you expand the node port range to 30000-32900, your firewall or packet filtering configuration must allow the inclusive port range of 30000-32900.
Procedure
To expand the range for the
serviceNodePortRangeparameter in thenetwork.config.openshift.ioobject that your cluster uses to manage traffic for pods, enter the following command:oc patch network.config.openshift.io cluster --type=merge -p \ '{ "spec": { "serviceNodePortRange": "<port_range>" } }'$ oc patch network.config.openshift.io cluster --type=merge -p \ '{ "spec": { "serviceNodePortRange": "<port_range>" } }'Copy to Clipboard Copied! Toggle word wrap Toggle overflow where:
<port_range>-
specifies your expanded range, such as
30000-32900.
TipYou can also apply the following YAML to update the node port range:
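A sketch of the YAML that the tip refers to, using the example range 30000-32900:

apiVersion: config.openshift.io/v1
kind: Network
metadata:
  name: cluster
spec:
  serviceNodePortRange: "30000-32900"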
Example output
network.config.openshift.io/cluster patched
network.config.openshift.io/cluster patchedCopy to Clipboard Copied! Toggle word wrap Toggle overflow
Verification
To confirm that the updated configuration is active, enter the following command. The update can take several minutes to apply.
oc get configmaps -n openshift-kube-apiserver config \ -o jsonpath="{.data['config\.yaml']}" | \ grep -Eo '"service-node-port-range":["[[:digit:]]+-[[:digit:]]+"]'$ oc get configmaps -n openshift-kube-apiserver config \ -o jsonpath="{.data['config\.yaml']}" | \ grep -Eo '"service-node-port-range":["[[:digit:]]+-[[:digit:]]+"]'Copy to Clipboard Copied! Toggle word wrap Toggle overflow Example output
"service-node-port-range":["30000-32900"]
"service-node-port-range":["30000-32900"]Copy to Clipboard Copied! Toggle word wrap Toggle overflow
Chapter 13. Configuring IP failover Copy linkLink copied to clipboard!
This topic describes configuring IP failover for pods and services on your OpenShift Container Platform cluster.
IP failover uses Keepalived to host a set of externally accessible Virtual IP (VIP) addresses on a set of hosts. Each VIP address is only serviced by a single host at a time. Keepalived uses the Virtual Router Redundancy Protocol (VRRP) to determine which host, from the set of hosts, services which VIP. If a host becomes unavailable, or if the service that Keepalived is watching does not respond, the VIP is switched to another host from the set. This means a VIP is always serviced as long as a host is available.
Every VIP in the set is serviced by a node selected from the set. If a single node is available, the VIPs are served. There is no way to explicitly distribute the VIPs over the nodes, so there can be nodes with no VIPs and other nodes with many VIPs. If there is only one node, all VIPs are on it.
The administrator must ensure that all of the VIP addresses meet the following requirements:
- Accessible on the configured hosts from outside the cluster.
- Not used for any other purpose within the cluster.
Keepalived on each node determines whether the needed service is running. If it is, VIPs are supported and Keepalived participates in the negotiation to determine which node serves the VIP. For a node to participate, the service must be listening on the watch port on a VIP or the check must be disabled.
Each VIP in the set might be served by a different node.
IP failover monitors a port on each VIP to determine whether the port is reachable on the node. If the port is not reachable, the VIP is not assigned to the node. If the port is set to 0, this check is suppressed. The check script does the needed testing.
When a node running Keepalived passes the check script, the VIP on that node can enter the master state based on its priority and the priority of the current master and as determined by the preemption strategy.
A cluster administrator can provide a script through the OPENSHIFT_HA_NOTIFY_SCRIPT variable, and this script is called whenever the state of the VIP on the node changes. Keepalived uses the master state when it is servicing the VIP, the backup state when another node is servicing the VIP, or in the fault state when the check script fails. The notify script is called with the new state whenever the state changes.
You can create an IP failover deployment configuration on OpenShift Container Platform. The IP failover deployment configuration specifies the set of VIP addresses, and the set of nodes on which to service them. A cluster can have multiple IP failover deployment configurations, with each managing its own set of unique VIP addresses. Each node in the IP failover configuration runs an IP failover pod, and this pod runs Keepalived.
When using VIPs to access a pod with host networking, the application pod runs on all nodes that are running the IP failover pods. This enables any of the IP failover nodes to become the master and service the VIPs when needed. If application pods are not running on all nodes with IP failover, either some IP failover nodes never service the VIPs or some application pods never receive any traffic. Use the same selector and replication count, for both IP failover and the application pods, to avoid this mismatch.
While using VIPs to access a service, any of the nodes can be in the IP failover set of nodes, since the service is reachable on all nodes, no matter where the application pod is running. Any of the IP failover nodes can become master at any time. The service can either use external IPs and a service port or it can use a NodePort. Setting up a NodePort is a privileged operation.
When using external IPs in the service definition, the VIPs are set to the external IPs, and the IP failover monitoring port is set to the service port. When using a node port, the port is open on every node in the cluster, and the service load-balances traffic from whatever node currently services the VIP. In this case, the IP failover monitoring port is set to the NodePort in the service definition.
Even though a service VIP is highly available, performance can still be affected. Keepalived makes sure that each of the VIPs is serviced by some node in the configuration, and several VIPs can end up on the same node even when other nodes have none. Strategies that externally load-balance across a set of VIPs can be thwarted when IP failover puts multiple VIPs on the same node.
When you use ExternalIP, you can set up IP failover to have the same VIP range as the ExternalIP range. You can also disable the monitoring port. In this case, all of the VIPs appear on the same node in the cluster. Any user can set up a service with an ExternalIP and make it highly available.
There are a maximum of 254 VIPs in the cluster.
13.1. IP failover environment variables Copy linkLink copied to clipboard!
The following table contains the variables used to configure IP failover.
| Variable Name | Default | Description |
|---|---|---|
| OPENSHIFT_HA_MONITOR_PORT | 80 | The IP failover pod tries to open a TCP connection to this port on each Virtual IP (VIP). If connection is established, the service is considered to be running. If this port is set to 0, the test always passes. |
| OPENSHIFT_HA_NETWORK_INTERFACE | | The interface name that IP failover uses to send Virtual Router Redundancy Protocol (VRRP) traffic. The default value is eth0. If your cluster uses the OVN-Kubernetes network plugin, set this value to br-ex. |
| OPENSHIFT_HA_REPLICA_COUNT | 2 | The number of replicas to create. This must match the spec.replicas value in the IP failover deployment configuration. |
| OPENSHIFT_HA_VIRTUAL_IPS | | The list of IP address ranges to replicate. This must be provided. For example, 1.2.3.4-6,1.2.3.9. |
| OPENSHIFT_HA_VRRP_ID_OFFSET | 0 | The offset value used to set the virtual router IDs. Using different offset values allows multiple IP failover configurations to exist within the same cluster. The default offset is 0, and the allowed range is 0 through 255. |
| OPENSHIFT_HA_VIP_GROUPS | | The number of groups to create for VRRP. If not set, a group is created for each virtual IP range specified with the OPENSHIFT_HA_VIRTUAL_IPS variable. |
| OPENSHIFT_HA_IPTABLES_CHAIN | INPUT | The name of the iptables chain, to automatically add an iptables rule to allow the VRRP traffic on. If the value is not set, an iptables rule is not added. If the chain does not exist, it is not created, and Keepalived operates in unicast mode. |
| OPENSHIFT_HA_CHECK_SCRIPT | | The full path name in the pod file system of a script that is periodically run to verify the application is operating. |
| OPENSHIFT_HA_CHECK_INTERVAL | 2 | The period, in seconds, that the check script is run. |
| OPENSHIFT_HA_NOTIFY_SCRIPT | | The full path name in the pod file system of a script that is run whenever the state changes. |
| OPENSHIFT_HA_PREEMPTION | preempt_delay 300 | The strategy for handling a new higher priority host. The nopreempt strategy does not move master from the lower priority host to the higher priority host. |
13.2. Configuring IP failover in your cluster Copy linkLink copied to clipboard!
As a cluster administrator, you can configure IP failover on an entire cluster, or on a subset of nodes, as defined by the label selector. You can also configure multiple IP failover deployments in your cluster, where each one is independent of the others.
The IP failover deployment ensures that a failover pod runs on each of the nodes matching the constraints or the label used.
This pod runs Keepalived, which can monitor an endpoint and use Virtual Router Redundancy Protocol (VRRP) to fail over the virtual IP (VIP) from one node to another if the first node cannot reach the service or endpoint.
For production use, set a selector that selects at least two nodes, and set replicas equal to the number of selected nodes.
Prerequisites
-
You are logged in to the cluster as a user with
cluster-adminprivileges. - You created a pull secret.
Red Hat OpenStack Platform (RHOSP) only:
- You installed an RHOSP client (RHCOS documentation) on the target environment.
-
You also downloaded the RHOSP
openrc.shrc file (RHCOS documentation).
Procedure
Create an IP failover service account:
oc create sa ipfailover
$ oc create sa ipfailoverCopy to Clipboard Copied! Toggle word wrap Toggle overflow Update security context constraints (SCC) for
hostNetwork:oc adm policy add-scc-to-user privileged -z ipfailover
$ oc adm policy add-scc-to-user privileged -z ipfailoverCopy to Clipboard Copied! Toggle word wrap Toggle overflow oc adm policy add-scc-to-user hostnetwork -z ipfailover
$ oc adm policy add-scc-to-user hostnetwork -z ipfailoverCopy to Clipboard Copied! Toggle word wrap Toggle overflow Red Hat OpenStack Platform (RHOSP) only: Complete the following steps to make a failover VIP address reachable on RHOSP ports.
Use the RHOSP CLI to show the default RHOSP API and VIP addresses in the
allowed_address_pairsparameter of your RHOSP cluster:openstack port show <cluster_name> -c allowed_address_pairs
$ openstack port show <cluster_name> -c allowed_address_pairsCopy to Clipboard Copied! Toggle word wrap Toggle overflow Output example
*Field* *Value* allowed_address_pairs ip_address='192.168.0.5', mac_address='fa:16:3e:31:f9:cb' ip_address='192.168.0.7', mac_address='fa:16:3e:31:f9:cb'*Field* *Value* allowed_address_pairs ip_address='192.168.0.5', mac_address='fa:16:3e:31:f9:cb' ip_address='192.168.0.7', mac_address='fa:16:3e:31:f9:cb'Copy to Clipboard Copied! Toggle word wrap Toggle overflow Set a different VIP address for the IP failover deployment and make the address reachable on RHOSP ports by entering the following command in the RHOSP CLI. Do not set any default RHOSP API and VIP addresses as the failover VIP address for the IP failover deployment.
Example of adding the
1.1.1.1failover IP address as an allowed address on RHOSP ports.openstack port set <cluster_name> --allowed-address ip-address=1.1.1.1,mac-address=fa:fa:16:3e:31:f9:cb
$ openstack port set <cluster_name> --allowed-address ip-address=1.1.1.1,mac-address=fa:fa:16:3e:31:f9:cbCopy to Clipboard Copied! Toggle word wrap Toggle overflow - Create a deployment YAML file to configure IP failover for your deployment. See "Example deployment YAML for IP failover configuration" in a later step.
Specify the following specification in the IP failover deployment so that you pass the failover VIP address to the
OPENSHIFT_HA_VIRTUAL_IPSenvironment variable:Example of adding the
1.1.1.1VIP address toOPENSHIFT_HA_VIRTUAL_IPSCopy to Clipboard Copied! Toggle word wrap Toggle overflow
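A sketch of the relevant part of the deployment spec; only the environment variable shown here needs to carry the failover VIP address:

spec:
  template:
    spec:
      containers:
      - name: openshift-ipfailover
        env:
        - name: OPENSHIFT_HA_VIRTUAL_IPS
          value: "1.1.1.1"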
Create a deployment YAML file to configure IP failover.
NoteFor Red Hat OpenStack Platform (RHOSP), you do not need to re-create the deployment YAML file. You already created this file as part of the earlier instructions.
Example deployment YAML for IP failover configuration
Copy to Clipboard Copied! Toggle word wrap Toggle overflow - 1
- The name of the IP failover deployment.
- 2
- The list of IP address ranges to replicate. This must be provided. For example,
1.2.3.4-6,1.2.3.9. - 3
- The number of groups to create for VRRP. If not set, a group is created for each virtual IP range specified with the
OPENSHIFT_HA_VIP_GROUPSvariable. - 4
- The interface name that IP failover uses to send VRRP traffic. By default,
eth0is used. - 5
- The IP failover pod tries to open a TCP connection to this port on each VIP. If connection is established, the service is considered to be running. If this port is set to
0, the test always passes. The default value is80. - 6
- The offset value used to set the virtual router IDs. Using different offset values allows multiple IP failover configurations to exist within the same cluster. The default offset is
0, and the allowed range is0through255. - 7
- The number of replicas to create. This must match
spec.replicasvalue in IP failover deployment configuration. The default value is2. - 8
- The name of the
iptableschain to automatically add aniptablesrule to allow the VRRP traffic on. If the value is not set, aniptablesrule is not added. If the chain does not exist, it is not created, and Keepalived operates in unicast mode. The default isINPUT. - 9
- The full path name in the pod file system of a script that is run whenever the state changes.
- 10
- The full path name in the pod file system of a script that is periodically run to verify the application is operating.
- 11
- The strategy for handling a new higher priority host. The default value is
preempt_delay 300, which causes a Keepalived instance to take over a VIP after 5 minutes if a lower-priority master is holding the VIP. - 12
- The period, in seconds, that the check script is run. The default value is
2. - 13
- Create the pull secret before creating the deployment, otherwise you will get an error when creating the deployment.
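The following is a condensed sketch of an IP failover Deployment that wires together the fields described by the callouts above. The callout numbers appear as comments; the image reference, VIP range, interface, monitor port, and pull secret name are assumptions to adapt for your cluster and version:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ipfailover-keepalived  # 1
  labels:
    ipfailover: hello-openshift
spec:
  strategy:
    type: Recreate
  replicas: 2
  selector:
    matchLabels:
      ipfailover: hello-openshift
  template:
    metadata:
      labels:
        ipfailover: hello-openshift
    spec:
      serviceAccountName: ipfailover
      hostNetwork: true
      nodeSelector:
        node-role.kubernetes.io/worker: ""
      containers:
      - name: openshift-ipfailover
        image: registry.redhat.io/openshift4/ose-keepalived-ipfailover:v4.12
        securityContext:
          privileged: true
        env:
        - name: OPENSHIFT_HA_VIRTUAL_IPS  # 2
          value: "1.2.3.4-6,1.2.3.9"
        - name: OPENSHIFT_HA_VIP_GROUPS  # 3
          value: "10"
        - name: OPENSHIFT_HA_NETWORK_INTERFACE  # 4
          value: "ens3"
        - name: OPENSHIFT_HA_MONITOR_PORT  # 5
          value: "30060"
        - name: OPENSHIFT_HA_VRRP_ID_OFFSET  # 6
          value: "0"
        - name: OPENSHIFT_HA_REPLICA_COUNT  # 7
          value: "2"
        - name: OPENSHIFT_HA_IPTABLES_CHAIN  # 8
          value: "INPUT"
        - name: OPENSHIFT_HA_NOTIFY_SCRIPT  # 9
          value: "/etc/keepalive/mynotifyscript.sh"
        - name: OPENSHIFT_HA_CHECK_SCRIPT  # 10
          value: "/etc/keepalive/mycheckscript.sh"
        - name: OPENSHIFT_HA_PREEMPTION  # 11
          value: "preempt_delay 300"
        - name: OPENSHIFT_HA_CHECK_INTERVAL  # 12
          value: "2"
        volumeMounts:
        - name: lib-modules
          mountPath: /lib/modules
          readOnly: true
      volumes:
      - name: lib-modules
        hostPath:
          path: /lib/modules
      imagePullSecrets:
      - name: openshift-pull-secret  # 13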
13.3. Configuring check and notify scripts Copy linkLink copied to clipboard!
Keepalived monitors the health of the application by periodically running an optional user-supplied check script. For example, the script can test a web server by issuing a request and verifying the response. As cluster administrator, you can provide an optional notify script, which is called whenever the state changes.
The check and notify scripts run in the IP failover pod and use the pod file system, not the host file system. However, the IP failover pod makes the host file system available under the /hosts mount path. When configuring a check or notify script, you must provide the full path to the script. The recommended approach for providing the scripts is to use a ConfigMap object.
The full path names of the check and notify scripts are added to the Keepalived configuration file, /etc/keepalived/keepalived.conf, which is loaded every time Keepalived starts. The scripts can be added to the pod with a ConfigMap object as described in the following methods.
Check script
When a check script is not provided, a simple default script is run that tests the TCP connection. This default test is suppressed when the monitor port is 0.
Each IP failover pod manages a Keepalived daemon that manages one or more virtual IP (VIP) addresses on the node where the pod is running. The Keepalived daemon keeps the state of each VIP for that node. A particular VIP on a particular node might be in master, backup, or fault state.
If the check script returns non-zero, the node enters the backup state, and any VIPs it holds are reassigned.
Notify script
Keepalived passes the following three parameters to the notify script:
-
$1-grouporinstance -
$2- Name of thegrouporinstance -
$3- The new state:master,backup, orfault
Prerequisites
-
You installed the OpenShift CLI (
oc). -
You are logged in to the cluster with a user with
cluster-adminprivileges.
Procedure
Create the desired script and create a
ConfigMapobject to hold it. The script has no input arguments and must return0forOKand1forfail.The check script,
mycheckscript.sh:#!/bin/bash # Whatever tests are needed # E.g., send request and verify response exit 0#!/bin/bash # Whatever tests are needed # E.g., send request and verify response exit 0Copy to Clipboard Copied! Toggle word wrap Toggle overflow Create the
ConfigMapobject :oc create configmap mycustomcheck --from-file=mycheckscript.sh
$ oc create configmap mycustomcheck --from-file=mycheckscript.shCopy to Clipboard Copied! Toggle word wrap Toggle overflow Add the script to the pod. The
defaultModefor the mountedConfigMapobject files must able to run by usingoccommands or by editing the deployment configuration. A value of0755,493decimal, is typical:oc set env deploy/ipfailover-keepalived \ OPENSHIFT_HA_CHECK_SCRIPT=/etc/keepalive/mycheckscript.sh$ oc set env deploy/ipfailover-keepalived \ OPENSHIFT_HA_CHECK_SCRIPT=/etc/keepalive/mycheckscript.shCopy to Clipboard Copied! Toggle word wrap Toggle overflow oc set volume deploy/ipfailover-keepalived --add --overwrite \ --name=config-volume \ --mount-path=/etc/keepalive \ --source='{"configMap": { "name": "mycustomcheck", "defaultMode": 493}}'$ oc set volume deploy/ipfailover-keepalived --add --overwrite \ --name=config-volume \ --mount-path=/etc/keepalive \ --source='{"configMap": { "name": "mycustomcheck", "defaultMode": 493}}'Copy to Clipboard Copied! Toggle word wrap Toggle overflow NoteThe
oc set envcommand is whitespace sensitive. There must be no whitespace on either side of the=sign.TipYou can alternatively edit the
ipfailover-keepaliveddeployment configuration:oc edit deploy ipfailover-keepalived
$ oc edit deploy ipfailover-keepalivedCopy to Clipboard Copied! Toggle word wrap Toggle overflow Copy to Clipboard Copied! Toggle word wrap Toggle overflow - 1
- In the
spec.container.envfield, add theOPENSHIFT_HA_CHECK_SCRIPTenvironment variable to point to the mounted script file. - 2
- Add the
spec.container.volumeMountsfield to create the mount point. - 3
- Add a new
spec.volumesfield to mention the config map. - 4
- This sets run permission on the files. When read back, it is displayed in decimal,
493.
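A sketch of the relevant portion of the edited deployment, matching the callouts above; the volume and ConfigMap names assume the mycustomcheck ConfigMap created earlier:

spec:
  template:
    spec:
      containers:
      - name: openshift-ipfailover
        env:
        - name: OPENSHIFT_HA_CHECK_SCRIPT  # 1
          value: /etc/keepalive/mycheckscript.sh
        volumeMounts:  # 2
        - mountPath: /etc/keepalive
          name: config-volume
      volumes:  # 3
      - name: config-volume
        configMap:
          name: mycustomcheck
          defaultMode: 0755  # 4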
Save the changes and exit the editor. This restarts
ipfailover-keepalived.
13.4. Configuring VRRP preemption Copy linkLink copied to clipboard!
When a Virtual IP (VIP) on a node leaves the fault state by passing the check script, the VIP on the node enters the backup state if it has lower priority than the VIP on the node that is currently in the master state. The nopreempt strategy does not move master from the lower priority VIP on the host to the higher priority VIP on the host. With preempt_delay 300, the default, Keepalived waits the specified 300 seconds and moves master to the higher priority VIP on the host.
Procedure
To specify preemption, enter oc edit deploy ipfailover-keepalived to edit the deployment configuration:

$ oc edit deploy ipfailover-keepalived
- Set the
OPENSHIFT_HA_PREEMPTIONvalue:-
preempt_delay 300: Keepalived waits the specified 300 seconds and movesmasterto the higher priority VIP on the host. This is the default value. -
nopreempt: does not movemasterfrom the lower priority VIP on the host to the higher priority VIP on the host.
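A sketch of the environment variable to change in the deployment, corresponding to callout 1 above:

spec:
  template:
    spec:
      containers:
      - name: openshift-ipfailover
        env:
        - name: OPENSHIFT_HA_PREEMPTION  # 1
          value: preempt_delay 300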
13.5. Deploying multiple IP failover instances
Each IP failover pod managed by the IP failover deployment configuration, 1 pod per node or replica, runs a Keepalived daemon. As more IP failover deployment configurations are configured, more pods are created and more daemons join the common Virtual Router Redundancy Protocol (VRRP) negotiation. This negotiation is done by all the Keepalived daemons and it determines which nodes service which virtual IPs (VIPs).
Internally, Keepalived assigns a unique vrrp-id to each VIP. The negotiation uses this set of vrrp-ids; when a decision is made, the VIP corresponding to the winning vrrp-id is serviced on the winning node.
Therefore, for every VIP defined in the IP failover deployment configuration, the IP failover pod must assign a corresponding vrrp-id. This is done by starting at OPENSHIFT_HA_VRRP_ID_OFFSET and sequentially assigning the vrrp-ids to the list of VIPs. The vrrp-ids can have values in the range 1..255.
When there are multiple IP failover deployment configurations, you must specify OPENSHIFT_HA_VRRP_ID_OFFSET so that there is room to increase the number of VIPs in the deployment configuration and none of the vrrp-id ranges overlap.
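As an illustration of the offset arithmetic, the following hypothetical commands configure two IP failover deployments so that their vrrp-id ranges cannot overlap. The deployment names, VIP lists, and offset values are examples only:

$ oc set env deploy/ipfailover \
    OPENSHIFT_HA_VIRTUAL_IPS="192.168.1.100-102" \
    OPENSHIFT_HA_VRRP_ID_OFFSET=0

$ oc set env deploy/ipfailover-second \
    OPENSHIFT_HA_VIRTUAL_IPS="192.168.1.200-203" \
    OPENSHIFT_HA_VRRP_ID_OFFSET=20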
13.6. Configuring IP failover for more than 254 addresses
IP failover management is limited to 254 groups of Virtual IP (VIP) addresses. By default, OpenShift Container Platform assigns one IP address to each group. You can use the OPENSHIFT_HA_VIP_GROUPS variable to change this so that multiple IP addresses are in each group, and to define the number of VIP groups available for each Virtual Router Redundancy Protocol (VRRP) instance when configuring IP failover.
Grouping VIPs creates a wider range of allocation of VIPs per VRRP in the case of VRRP failover events, and is useful when all hosts in the cluster have access to a service locally. For example, when a service is being exposed with an ExternalIP.
As a rule for failover, do not limit services, such as the router, to one specific host. Instead, services should be replicated to each host so that in the case of IP failover, the services do not have to be recreated on the new host.
If you are using OpenShift Container Platform health checks, the nature of IP failover and groups means that not all instances in the group are checked. For that reason, you must use the Kubernetes health checks to ensure that services are live.
Prerequisites
- You are logged in to the cluster with a user with cluster-admin privileges.
Procedure
To change the number of IP addresses assigned to each group, change the value for the OPENSHIFT_HA_VIP_GROUPS variable, for example:

Example Deployment YAML for IP failover configuration

1. If OPENSHIFT_HA_VIP_GROUPS is set to 3 in an environment with seven VIPs, it creates three groups, assigning three VIPs to the first group, and two VIPs to the two remaining groups.
If the number of groups set by OPENSHIFT_HA_VIP_GROUPS is fewer than the number of IP addresses set to fail over, the group contains more than one IP address, and all of the addresses move as a single unit.
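The example Deployment YAML referenced above is not preserved in this copy; a minimal sketch of the relevant fragment, assuming the deployment name used elsewhere in this chapter:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ipfailover-keepalived
spec:
  template:
    spec:
      containers:
      - name: ipfailover-keepalived
        env:
        - name: OPENSHIFT_HA_VIP_GROUPS   # 1
          value: "3"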
13.7. High availability for ExternalIP
In non-cloud clusters, you can combine IP failover and assigning an ExternalIP to a service. The result is high availability services for users that create services by using ExternalIP.
The approach is to specify a spec.ExternalIP.autoAssignCIDRs range in the cluster network configuration, and then use the same range when creating the IP failover configuration.
Because IP failover can support up to a maximum of 255 VIPs for the entire cluster, the spec.ExternalIP.autoAssignCIDRs must be /24 or smaller.
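A minimal sketch of the cluster Network configuration with an auto-assign CIDR; the CIDR value is an example only:

apiVersion: config.openshift.io/v1
kind: Network
metadata:
  name: cluster
spec:
  externalIP:
    autoAssignCIDRs:
    - 192.168.132.0/29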
13.8. Removing IP failover
When IP failover is initially configured, the worker nodes in the cluster are modified with an iptables rule that explicitly allows multicast packets on 224.0.0.18 for Keepalived. Because of the change to the nodes, removing IP failover requires running a job to remove the iptables rule and removing the virtual IP addresses used by Keepalived.
Procedure
Optional: Identify and delete any check and notify scripts that are stored as config maps:
Identify whether any pods for IP failover use a config map as a volume:
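The exact command is not preserved in this copy. One way to list each IP failover pod and its volumes, assuming the pods carry the ipfailover label used elsewhere in this chapter (adjust the selector and namespace to your deployment):

$ oc get pods -l ipfailover \
  -o jsonpath='{range .items[*]}Pod: {.metadata.name}{"\n"}{range .spec.volumes[*]}volume: {.name} configMap: {.configMap.name}{"\n"}{end}{"\n"}{end}'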
Example output

Namespace: default
Pod: keepalived-worker-59df45db9c-2x9mn
Volumes that use config maps:
  volume: config-volume
  configMap: mycustomcheck

If the preceding step provided the names of config maps that are used as volumes, delete the config maps:

$ oc delete configmap <configmap_name>
Identify an existing deployment for IP failover:
$ oc get deployment -l ipfailover

Example output

NAMESPACE   NAME         READY   UP-TO-DATE   AVAILABLE   AGE
default     ipfailover   2/2     2            2           105d

Delete the deployment:

$ oc delete deployment <ipfailover_deployment_name>

Remove the ipfailover service account:

$ oc delete sa ipfailover

Run a job that removes the iptables rule that was added when IP failover was initially configured:
Create a file such as remove-ipfailover-job.yaml with contents that are similar to the following example:
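The job manifest is not preserved in this copy. A sketch of what remove-ipfailover-job.yaml might contain; the image reference and script path are placeholders, so substitute the image used by your IP failover deployment and the remove-failover.sh script it ships:

apiVersion: batch/v1
kind: Job
metadata:
  generateName: remove-ipfailover-
  labels:
    app: remove-ipfailover
spec:
  template:
    metadata:
      name: remove-ipfailover
    spec:
      containers:
      - name: remove-ipfailover
        image: <ipfailover_image>                                       # placeholder: same image as the IP failover deployment
        command: ["/var/lib/ipfailover/keepalived/remove-failover.sh"]  # path assumed; adjust to your image
      nodeSelector:
        kubernetes.io/hostname: <node_name>                             # node that was configured for IP failover
      restartPolicy: Never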
Run the job:

$ oc create -f remove-ipfailover-job.yaml

Example output

job.batch/remove-ipfailover-2h8dm created
Verification
Confirm that the job removed the initial configuration for IP failover.
$ oc logs job/remove-ipfailover-2h8dm

Example output

remove-failover.sh: OpenShift IP Failover service terminating.
  - Removing ip_vs module ...
  - Cleaning up ...
  - Releasing VIPs  (interface eth0) ...
Chapter 14. Configuring interface-level network sysctls
In Linux, sysctl allows an administrator to modify kernel parameters at runtime. You can modify interface-level network sysctls by using the tuning Container Network Interface (CNI) meta plugin. The tuning CNI meta plugin operates in a chain with a main CNI plugin.
The main CNI plugin assigns the interface and passes this to the tuning CNI meta plugin at runtime. You can change some sysctls and several interface attributes (promiscuous mode, all-multicast mode, MTU, and MAC address) in the network namespace by using the tuning CNI meta plugin. In the tuning CNI meta plugin configuration, the interface name is represented by the IFNAME token, and is replaced with the actual name of the interface at runtime.
In OpenShift Container Platform, the tuning CNI meta plugin only supports changing interface-level network sysctls.
14.1. Configuring the tuning CNI
The following procedure configures the tuning CNI to change the interface-level network net.ipv4.conf.IFNAME.accept_redirects sysctl. This example enables accepting ICMP redirect packets.
Procedure
Create a network attachment definition, such as tuning-example.yaml, with the following content:

1. Specifies the name for the additional network attachment to create. The name must be unique within the specified namespace.
2. Specifies the namespace that the object is associated with.
3. Specifies the CNI specification version.
4. Specifies the name for the configuration. It is recommended to match the configuration name to the name value of the network attachment definition.
5. Specifies the name of the main CNI plugin to configure.
6. Specifies the name of the CNI meta plugin.
7. Specifies the sysctl to set.
An example YAML file is shown here:
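The original example is not preserved in this copy; the following sketch is consistent with the callouts above and with the tuningnad name and accept_redirects value used later in this procedure. The bridge main plugin and the default namespace are assumptions:

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: tuningnad
  namespace: default
spec:
  config: '{
    "cniVersion": "0.4.0",
    "name": "tuningnad",
    "plugins": [
      {
        "type": "bridge"
      },
      {
        "type": "tuning",
        "sysctl": {
          "net.ipv4.conf.IFNAME.accept_redirects": "1"
        }
      }
    ]
  }'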
Apply the YAML by running the following command:
$ oc apply -f tuning-example.yaml

Example output

networkattachmentdefinition.k8s.cni.cncf.io/tuningnad created

Create a pod such as examplepod.yaml with the network attachment definition similar to the following:
1. Specify the name of the configured NetworkAttachmentDefinition.
2. runAsUser controls which user ID the container is run with.
3. runAsGroup controls which primary group ID the container is run with.
4. allowPrivilegeEscalation determines if a pod can request to allow privilege escalation. If unspecified, it defaults to true. This boolean directly controls whether the no_new_privs flag gets set on the container process.
5. capabilities permit privileged actions without giving full root access. This policy ensures all capabilities are dropped from the pod.
6. runAsNonRoot: true requires that the container will run with a user with any UID other than 0.
7. RuntimeDefault enables the default seccomp profile for a pod or container workload.
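The pod manifest itself is not preserved here; the following sketch matches the callouts above and the tunepod name used in the verification steps. The image, user ID, and group ID values are examples:

apiVersion: v1
kind: Pod
metadata:
  name: tunepod
  namespace: default
  annotations:
    k8s.v1.cni.cncf.io/networks: tuningnad   # 1
spec:
  containers:
  - name: podexample
    image: <image>                           # any image that provides a shell
    command: ["/bin/sh", "-c", "sleep infinity"]
    securityContext:
      runAsUser: 2000                        # 2
      runAsGroup: 3000                       # 3
      allowPrivilegeEscalation: false        # 4
      capabilities:                          # 5
        drop: ["ALL"]
  securityContext:
    runAsNonRoot: true                       # 6
    seccompProfile:
      type: RuntimeDefault                   # 7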
Apply the YAML by running the following command:

$ oc apply -f examplepod.yaml

Verify that the pod is created by running the following command:

$ oc get pod

Example output

NAME      READY   STATUS    RESTARTS   AGE
tunepod   1/1     Running   0          47s

Log in to the pod by running the following command:

$ oc rsh tunepod

Verify the values of the configured sysctl flags. For example, find the value net.ipv4.conf.net1.accept_redirects by running the following command:

sh-4.4# sysctl net.ipv4.conf.net1.accept_redirects

Expected output

net.ipv4.conf.net1.accept_redirects = 1
Chapter 15. Using the Stream Control Transmission Protocol (SCTP)
As a cluster administrator, you can use the Stream Control Transmission Protocol (SCTP) on a bare-metal cluster.
15.1. Support for SCTP on OpenShift Container Platform
As a cluster administrator, you can enable SCTP on the hosts in the cluster. On Red Hat Enterprise Linux CoreOS (RHCOS), the SCTP module is disabled by default.
SCTP is a reliable message-based protocol that runs on top of an IP network.
When enabled, you can use SCTP as a protocol with pods, services, and network policy. A Service object must be defined with the type parameter set to either the ClusterIP or NodePort value.
15.1.1. Example configurations using SCTP protocol
You can configure a pod or service to use SCTP by setting the protocol parameter to the SCTP value in the pod or service object.
In the following example, a pod is configured to use SCTP:
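The example is not preserved in this copy; a minimal sketch of a pod that declares an SCTP port (names and port number are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: sctp-example-pod
  labels:
    app: sctp-example
spec:
  containers:
  - name: webserver
    image: <image>
    ports:
    - containerPort: 30100
      name: sctpserver
      protocol: SCTP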
In the following example, a service is configured to use SCTP:
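Likewise, a sketch of a service that exposes an SCTP port (names and port number are illustrative):

apiVersion: v1
kind: Service
metadata:
  name: sctp-example-service
spec:
  type: ClusterIP
  selector:
    app: sctp-example
  ports:
  - name: sctpserver
    protocol: SCTP
    port: 30100
    targetPort: 30100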
In the following example, a NetworkPolicy object is configured to apply to SCTP network traffic on port 80 from any pods with a specific label:
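A sketch of such a NetworkPolicy; the pod label used in the selector is an assumption:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-sctp-on-port-80
spec:
  podSelector: {}
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: sctp-client
    ports:
    - protocol: SCTP
      port: 80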
15.2. Enabling Stream Control Transmission Protocol (SCTP)
As a cluster administrator, you can load and enable the blacklisted SCTP kernel module on worker nodes in your cluster.
Prerequisites
- Install the OpenShift CLI (oc).
- Access to the cluster as a user with the cluster-admin role.
Procedure
Create a file named load-sctp-module.yaml that contains the following YAML definition:
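The MachineConfig definition is not preserved in this copy. A sketch of one common approach, which overrides the module blacklist and loads sctp at boot on worker nodes; verify the Ignition version and role label against your cluster before applying:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: load-sctp-module
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      files:
      - path: /etc/modprobe.d/sctp-blacklist.conf
        mode: 0644
        overwrite: true
        contents:
          source: data:,
      - path: /etc/modules-load.d/sctp-load.conf
        mode: 0644
        overwrite: true
        contents:
          source: data:,sctp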
To create the MachineConfig object, enter the following command:

$ oc create -f load-sctp-module.yaml

Optional: To watch the status of the nodes while the MachineConfig Operator applies the configuration change, enter the following command. When the status of a node transitions to Ready, the configuration update is applied.

$ oc get nodes
15.3. Verifying Stream Control Transmission Protocol (SCTP) is enabled
You can verify that SCTP is working on a cluster by creating a pod with an application that listens for SCTP traffic, associating it with a service, and then connecting to the exposed service.
Prerequisites
- Access to the internet from the cluster to install the nc package.
- Install the OpenShift CLI (oc).
- Access to the cluster as a user with the cluster-admin role.
Procedure
Create a pod that starts an SCTP listener:
Create a file named sctp-server.yaml that defines a pod with the following YAML:
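The pod definition is not preserved here; a sketch consistent with the sctpserver name and port 30102 used later in this procedure. The image is a placeholder for any image that can install or provide nc:

apiVersion: v1
kind: Pod
metadata:
  name: sctpserver
  labels:
    app: sctpserver
spec:
  containers:
  - name: sctpserver
    image: <image_with_nc>
    command: ["/bin/sh", "-c"]
    args: ["dnf install -y nc && sleep inf"]
    ports:
    - containerPort: 30102
      name: sctpserver
      protocol: SCTP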
Create the pod by entering the following command:

$ oc create -f sctp-server.yaml
Create a service for the SCTP listener pod.
Create a file named sctp-service.yaml that defines a service with the following YAML:
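A sketch consistent with the sctpservice name and port 30102 used below; the selector assumes the app: sctpserver label from the server pod sketch:

apiVersion: v1
kind: Service
metadata:
  name: sctpservice
  labels:
    app: sctpserver
spec:
  type: ClusterIP
  selector:
    app: sctpserver
  ports:
  - name: sctpserver
    protocol: SCTP
    port: 30102
    targetPort: 30102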
To create the service, enter the following command:

$ oc create -f sctp-service.yaml
Create a pod for the SCTP client.
Create a file named sctp-client.yaml with the following YAML:
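A sketch consistent with the sctpclient name used below; again the image is a placeholder for any image that provides nc:

apiVersion: v1
kind: Pod
metadata:
  name: sctpclient
  labels:
    app: sctpclient
spec:
  containers:
  - name: sctpclient
    image: <image_with_nc>
    command: ["/bin/sh", "-c"]
    args: ["dnf install -y nc && sleep inf"]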
To create the Pod object, enter the following command:

$ oc apply -f sctp-client.yaml
Run an SCTP listener on the server.
To connect to the server pod, enter the following command:
$ oc rsh sctpserver

To start the SCTP listener, enter the following command:

$ nc -l 30102 --sctp
Connect to the SCTP listener on the server.
- Open a new terminal window or tab in your terminal program.
Obtain the IP address of the sctpservice service. Enter the following command:

$ oc get services sctpservice -o go-template='{{.spec.clusterIP}}{{"\n"}}'

To connect to the client pod, enter the following command:

$ oc rsh sctpclient

To start the SCTP client, enter the following command. Replace <cluster_IP> with the cluster IP address of the sctpservice service.

# nc <cluster_IP> 30102 --sctp
Chapter 16. Using Precision Time Protocol hardware
You can configure linuxptp services and use PTP-capable hardware in OpenShift Container Platform cluster nodes.
16.1. About PTP hardware
You can use the OpenShift Container Platform console or OpenShift CLI (oc) to install PTP by deploying the PTP Operator. The PTP Operator creates and manages the linuxptp services and provides the following features:
- Discovery of the PTP-capable devices in the cluster.
- Management of the configuration of linuxptp services.
- Notification of PTP clock events that negatively affect the performance and reliability of your application with the PTP Operator cloud-event-proxy sidecar.
The PTP Operator works with PTP-capable devices on clusters provisioned only on bare-metal infrastructure.
16.2. About PTP
Precision Time Protocol (PTP) is used to synchronize clocks in a network. When used in conjunction with hardware support, PTP is capable of sub-microsecond accuracy, and is more accurate than Network Time Protocol (NTP).
The linuxptp package includes the ptp4l and phc2sys programs for clock synchronization. ptp4l implements the PTP boundary clock and ordinary clock. ptp4l synchronizes the PTP hardware clock to the source clock with hardware time stamping and synchronizes the system clock to the source clock with software time stamping. phc2sys is used for hardware time stamping to synchronize the system clock to the PTP hardware clock on the network interface controller (NIC).
16.2.1. Elements of a PTP domain
PTP is used to synchronize multiple nodes connected in a network, with clocks for each node. The clocks synchronized by PTP are organized in a source-destination hierarchy. The hierarchy is created and updated automatically by the best master clock (BMC) algorithm, which runs on every clock. Destination clocks are synchronized to source clocks, and destination clocks can themselves be the source for other downstream clocks. The following types of clocks can be included in configurations:
- Grandmaster clock
- The grandmaster clock provides standard time information to other clocks across the network and ensures accurate and stable synchronization. It writes time stamps and responds to time requests from other clocks. Grandmaster clocks can be synchronized to a Global Positioning System (GPS) time source.
- Ordinary clock
- The ordinary clock has a single port connection that can play the role of source or destination clock, depending on its position in the network. The ordinary clock can read and write time stamps.
- Boundary clock
- The boundary clock has ports in two or more communication paths and can be a source and a destination to other destination clocks at the same time. The boundary clock works as a destination clock upstream. The destination clock receives the timing message, adjusts for delay, and then creates a new source time signal to pass down the network. The boundary clock produces a new timing packet that is still correctly synced with the source clock and can reduce the number of connected devices reporting directly to the source clock.
16.2.2. Advantages of PTP over NTP
One of the main advantages that PTP has over NTP is the hardware support present in various network interface controllers (NIC) and network switches. The specialized hardware allows PTP to account for delays in message transfer and improves the accuracy of time synchronization. To achieve the best possible accuracy, it is recommended that all networking components between PTP clocks are PTP hardware enabled.
Hardware-based PTP provides optimal accuracy, since the NIC can time stamp the PTP packets at the exact moment they are sent and received. Compare this to software-based PTP, which requires additional processing of the PTP packets by the operating system.
Before enabling PTP, ensure that NTP is disabled for the required nodes. You can disable the chrony time service (chronyd) using a MachineConfig custom resource. For more information, see Disabling chrony time service.
16.2.3. Using PTP with dual NIC hardware
OpenShift Container Platform supports single and dual NIC hardware for precision PTP timing in the cluster.
For 5G telco networks that deliver mid-band spectrum coverage, each virtual distributed unit (vDU) requires connections to 6 radio units (RUs). To make these connections, each vDU host requires 2 NICs configured as boundary clocks.
Dual NIC hardware allows you to connect each NIC to the same upstream leader clock with separate ptp4l instances for each NIC feeding the downstream clocks.
16.3. Installing the PTP Operator using the CLI
As a cluster administrator, you can install the PTP Operator by using the CLI.
Prerequisites
- A cluster installed on bare-metal hardware with nodes that have hardware that supports PTP.
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
Procedure
Create a namespace for the PTP Operator.
Save the following YAML in the ptp-namespace.yaml file:
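The YAML is not preserved in this copy; a sketch of a Namespace definition for the openshift-ptp namespace (the labels and annotation shown are commonly used but optional):

apiVersion: v1
kind: Namespace
metadata:
  name: openshift-ptp
  annotations:
    workload.openshift.io/allowed: management
  labels:
    name: openshift-ptp
    openshift.io/cluster-monitoring: "true"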
Create the Namespace CR:

$ oc create -f ptp-namespace.yaml
Create an Operator group for the PTP Operator.
Save the following YAML in the ptp-operatorgroup.yaml file:
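A sketch of an OperatorGroup that targets the openshift-ptp namespace; the object name is an example:

apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: ptp-operators
  namespace: openshift-ptp
spec:
  targetNamespaces:
  - openshift-ptp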
Create the OperatorGroup CR:

$ oc create -f ptp-operatorgroup.yaml
Subscribe to the PTP Operator.
Save the following YAML in the ptp-sub.yaml file:
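A sketch of a Subscription for the PTP Operator; confirm the channel name against the catalog for your OpenShift Container Platform version:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: ptp-operator-subscription
  namespace: openshift-ptp
spec:
  channel: "stable"
  name: ptp-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace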
Create the Subscription CR:

$ oc create -f ptp-sub.yaml
To verify that the Operator is installed, enter the following command:
$ oc get csv -n openshift-ptp -o custom-columns=Name:.metadata.name,Phase:.status.phase

Example output

Name                  Phase
4.12.0-202301261535   Succeeded
16.4. Installing the PTP Operator using the web console
As a cluster administrator, you can install the PTP Operator using the web console.
You have to create the namespace and Operator group as mentioned in the previous section.
Procedure
Install the PTP Operator using the OpenShift Container Platform web console:
- In the OpenShift Container Platform web console, click Operators → OperatorHub.
- Choose PTP Operator from the list of available Operators, and then click Install.
- On the Install Operator page, under A specific namespace on the cluster select openshift-ptp. Then, click Install.
Optional: Verify that the PTP Operator installed successfully:
- Switch to the Operators → Installed Operators page.
Ensure that PTP Operator is listed in the openshift-ptp project with a Status of InstallSucceeded.
Note

During installation an Operator might display a Failed status. If the installation later succeeds with an InstallSucceeded message, you can ignore the Failed message.
If the Operator does not appear as installed, to troubleshoot further:
- Go to the Operators → Installed Operators page and inspect the Operator Subscriptions and Install Plans tabs for any failure or errors under Status.
- Go to the Workloads → Pods page and check the logs for pods in the openshift-ptp project.
16.5. Configuring PTP devices
The PTP Operator adds the NodePtpDevice.ptp.openshift.io custom resource definition (CRD) to OpenShift Container Platform.
When installed, the PTP Operator searches your cluster for PTP-capable network devices on each node. It creates and updates a NodePtpDevice custom resource (CR) object for each node that provides a compatible PTP-capable network device.
16.5.1. Discovering PTP-capable network devices in your cluster
Identify PTP-capable network devices that exist in your cluster so that you can configure them.
Prerequisites
- You installed the PTP Operator.
Procedure
To return a complete list of PTP-capable network devices in your cluster, run the following command:

$ oc get NodePtpDevice -n openshift-ptp -o yaml

Example output
16.5.2. Configuring linuxptp services as a grandmaster clock
You can configure the linuxptp services (ptp4l, phc2sys, ts2phc) as grandmaster clock by creating a PtpConfig custom resource (CR) that configures the host NIC.
The ts2phc utility allows you to synchronize the system clock with the PTP grandmaster clock so that the node can stream precision clock signal to downstream PTP ordinary clocks and boundary clocks.
Use the following example PtpConfig CR as the basis to configure linuxptp services as the grandmaster clock for your particular hardware and environment. This example CR does not configure PTP fast events. To configure PTP fast events, set appropriate values for ptp4lOpts, ptp4lConf, and ptpClockThreshold. ptpClockThreshold is used only when events are enabled. See "Configuring the PTP fast event notifications publisher" for more information.
Prerequisites
- Install an Intel Westport Channel network interface in the bare-metal cluster host.
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
- Install the PTP Operator.
Procedure
Create the PtpConfig resource. For example:

Save the following YAML in the grandmaster-clock-ptp-config.yaml file:

Example PTP grandmaster clock configuration

Create the CR by running the following command:

$ oc create -f grandmaster-clock-ptp-config.yaml
Verification
Check that the PtpConfig profile is applied to the node.

Get the list of pods in the openshift-ptp namespace by running the following command:

$ oc get pods -n openshift-ptp -o wide

Example output

NAME                           READY   STATUS    RESTARTS   AGE     IP             NODE
linuxptp-daemon-74m2g          3/3     Running   3          4d15h   10.16.230.7    compute-1.example.com
ptp-operator-5f4f48d7c-x7zkf   1/1     Running   1          4d15h   10.128.1.145   compute-1.example.com

Check that the profile is correct. Examine the logs of the linuxptp daemon that corresponds to the node you specified in the PtpConfig profile. Run the following command:

$ oc logs linuxptp-daemon-74m2g -n openshift-ptp -c linuxptp-daemon-container

Example output
16.5.3. Configuring linuxptp services as an ordinary clock
You can configure linuxptp services (ptp4l, phc2sys) as ordinary clock by creating a PtpConfig custom resource (CR) object.
Use the following example PtpConfig CR as the basis to configure linuxptp services as an ordinary clock for your particular hardware and environment. This example CR does not configure PTP fast events. To configure PTP fast events, set appropriate values for ptp4lOpts, ptp4lConf, and ptpClockThreshold. ptpClockThreshold is required only when events are enabled. See "Configuring the PTP fast event notifications publisher" for more information.
Prerequisites
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
- Install the PTP Operator.
Procedure
Create the following PtpConfig CR, and then save the YAML in the ordinary-clock-ptp-config.yaml file.

Example PTP ordinary clock configuration
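The full example CR is not preserved in this copy. A trimmed sketch that follows the field descriptions in Table 16.1; interface, options, and threshold values are examples to adapt for your hardware:

apiVersion: ptp.openshift.io/v1
kind: PtpConfig
metadata:
  name: ordinary-clock
  namespace: openshift-ptp
spec:
  profile:
  - name: ordinary-clock
    interface: ens787f1
    ptp4lOpts: "-2 -s"
    phc2sysOpts: "-a -r -n 24"
    ptp4lConf: ""
    ptpSchedulingPolicy: SCHED_OTHER
  recommend:
  - profile: ordinary-clock
    priority: 0
    match:
    - nodeLabel: "node-role.kubernetes.io/worker"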
Table 16.1. PTP ordinary clock CR configuration options

- name: The name of the PtpConfig CR.
- profile: Specify an array of one or more profile objects. Each profile must be uniquely named.
- interface: Specify the network interface to be used by the ptp4l service, for example ens787f1.
- ptp4lOpts: Specify system config options for the ptp4l service, for example -2 to select the IEEE 802.3 network transport. The options should not include the network interface name -i <interface> and service config file -f /etc/ptp4l.conf because the network interface name and the service config file are automatically appended. Append --summary_interval -4 to use PTP fast events with this interface.
- phc2sysOpts: Specify system config options for the phc2sys service. If this field is empty, the PTP Operator does not start the phc2sys service. For Intel Columbiaville 800 Series NICs, set phc2sysOpts options to -a -r -m -n 24 -N 8 -R 16. -m prints messages to stdout. The linuxptp-daemon DaemonSet parses the logs and generates Prometheus metrics.
- ptp4lConf: Specify a string that contains the configuration to replace the default /etc/ptp4l.conf file. To use the default configuration, leave the field empty.
- tx_timestamp_timeout: For Intel Columbiaville 800 Series NICs, set tx_timestamp_timeout to 50.
- boundary_clock_jbod: For Intel Columbiaville 800 Series NICs, set boundary_clock_jbod to 0.
- ptpSchedulingPolicy: Scheduling policy for ptp4l and phc2sys processes. Default value is SCHED_OTHER. Use SCHED_FIFO on systems that support FIFO scheduling.
- ptpSchedulingPriority: Integer value from 1-65 used to set FIFO priority for ptp4l and phc2sys processes when ptpSchedulingPolicy is set to SCHED_FIFO. The ptpSchedulingPriority field is not used when ptpSchedulingPolicy is set to SCHED_OTHER.
- ptpClockThreshold: Optional. If ptpClockThreshold is not present, default values are used for the ptpClockThreshold fields. ptpClockThreshold configures how long after the PTP master clock is disconnected before PTP events are triggered. holdOverTimeout is the time value in seconds before the PTP clock event state changes to FREERUN when the PTP master clock is disconnected. The maxOffsetThreshold and minOffsetThreshold settings configure offset values in nanoseconds that compare against the values for CLOCK_REALTIME (phc2sys) or master offset (ptp4l). When the ptp4l or phc2sys offset value is outside this range, the PTP clock state is set to FREERUN. When the offset value is within this range, the PTP clock state is set to LOCKED.
- recommend: Specify an array of one or more recommend objects that define rules on how the profile should be applied to nodes.
- .recommend.profile: Specify the .recommend.profile object name defined in the profile section.
- .recommend.priority: Set .recommend.priority to 0 for ordinary clock.
- .recommend.match: Specify .recommend.match rules with nodeLabel or nodeName values.
- .recommend.match.nodeLabel: Set nodeLabel with the key of the node.Labels field from the node object by using the oc get nodes --show-labels command. For example, node-role.kubernetes.io/worker.
- .recommend.match.nodeName: Set nodeName with the value of the node.Name field from the node object by using the oc get nodes command. For example, compute-1.example.com.
Create the PtpConfig CR by running the following command:

$ oc create -f ordinary-clock-ptp-config.yaml
Verification
Check that the PtpConfig profile is applied to the node.

Get the list of pods in the openshift-ptp namespace by running the following command:

$ oc get pods -n openshift-ptp -o wide

Example output

NAME                            READY   STATUS    RESTARTS   AGE   IP            NODE
linuxptp-daemon-4xkbb           1/1     Running   0          43m   10.1.196.24   compute-0.example.com
linuxptp-daemon-tdspf           1/1     Running   0          43m   10.1.196.25   compute-1.example.com
ptp-operator-657bbb64c8-2f8sj   1/1     Running   0          43m   10.129.0.61   control-plane-1.example.com

Check that the profile is correct. Examine the logs of the linuxptp daemon that corresponds to the node you specified in the PtpConfig profile. Run the following command:

$ oc logs linuxptp-daemon-4xkbb -n openshift-ptp -c linuxptp-daemon-container

Example output
16.5.4. Configuring linuxptp services as a boundary clock
You can configure the linuxptp services (ptp4l, phc2sys) as boundary clock by creating a PtpConfig custom resource (CR) object.
Use the following example PtpConfig CR as the basis to configure linuxptp services as the boundary clock for your particular hardware and environment. This example CR does not configure PTP fast events. To configure PTP fast events, set appropriate values for ptp4lOpts, ptp4lConf, and ptpClockThreshold. ptpClockThreshold is used only when events are enabled. See "Configuring the PTP fast event notifications publisher" for more information.
Prerequisites
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
- Install the PTP Operator.
Procedure
Create the following PtpConfig CR, and then save the YAML in the boundary-clock-ptp-config.yaml file.

Example PTP boundary clock configuration
Table 16.2. PTP boundary clock CR configuration options

- name: The name of the PtpConfig CR.
- profile: Specify an array of one or more profile objects.
- name: Specify the name of a profile object which uniquely identifies a profile object.
- ptp4lOpts: Specify system config options for the ptp4l service. The options should not include the network interface name -i <interface> and service config file -f /etc/ptp4l.conf because the network interface name and the service config file are automatically appended.
- ptp4lConf: Specify the required configuration to start ptp4l as boundary clock. For example, ens1f0 synchronizes from a grandmaster clock and ens1f3 synchronizes connected devices.
- <interface_1>: The interface that receives the synchronization clock.
- <interface_2>: The interface that sends the synchronization clock.
- tx_timestamp_timeout: For Intel Columbiaville 800 Series NICs, set tx_timestamp_timeout to 50.
- boundary_clock_jbod: For Intel Columbiaville 800 Series NICs, ensure boundary_clock_jbod is set to 0. For Intel Fortville X710 Series NICs, ensure boundary_clock_jbod is set to 1.
- phc2sysOpts: Specify system config options for the phc2sys service. If this field is empty, the PTP Operator does not start the phc2sys service.
- ptpSchedulingPolicy: Scheduling policy for ptp4l and phc2sys processes. Default value is SCHED_OTHER. Use SCHED_FIFO on systems that support FIFO scheduling.
- ptpSchedulingPriority: Integer value from 1-65 used to set FIFO priority for ptp4l and phc2sys processes when ptpSchedulingPolicy is set to SCHED_FIFO. The ptpSchedulingPriority field is not used when ptpSchedulingPolicy is set to SCHED_OTHER.
- ptpClockThreshold: Optional. If ptpClockThreshold is not present, default values are used for the ptpClockThreshold fields. ptpClockThreshold configures how long after the PTP master clock is disconnected before PTP events are triggered. holdOverTimeout is the time value in seconds before the PTP clock event state changes to FREERUN when the PTP master clock is disconnected. The maxOffsetThreshold and minOffsetThreshold settings configure offset values in nanoseconds that compare against the values for CLOCK_REALTIME (phc2sys) or master offset (ptp4l). When the ptp4l or phc2sys offset value is outside this range, the PTP clock state is set to FREERUN. When the offset value is within this range, the PTP clock state is set to LOCKED.
- recommend: Specify an array of one or more recommend objects that define rules on how the profile should be applied to nodes.
- .recommend.profile: Specify the .recommend.profile object name defined in the profile section.
- .recommend.priority: Specify the priority with an integer value between 0 and 99. A larger number gets lower priority, so a priority of 99 is lower than a priority of 10. If a node can be matched with multiple profiles according to rules defined in the match field, the profile with the higher priority is applied to that node.
- .recommend.match: Specify .recommend.match rules with nodeLabel or nodeName values.
- .recommend.match.nodeLabel: Set nodeLabel with the key of the node.Labels field from the node object by using the oc get nodes --show-labels command. For example, node-role.kubernetes.io/worker.
- .recommend.match.nodeName: Set nodeName with the value of the node.Name field from the node object by using the oc get nodes command. For example, compute-1.example.com.

Create the CR by running the following command:
$ oc create -f boundary-clock-ptp-config.yaml
Verification
Check that the PtpConfig profile is applied to the node.

Get the list of pods in the openshift-ptp namespace by running the following command:

$ oc get pods -n openshift-ptp -o wide

Example output

NAME                            READY   STATUS    RESTARTS   AGE   IP            NODE
linuxptp-daemon-4xkbb           1/1     Running   0          43m   10.1.196.24   compute-0.example.com
linuxptp-daemon-tdspf           1/1     Running   0          43m   10.1.196.25   compute-1.example.com
ptp-operator-657bbb64c8-2f8sj   1/1     Running   0          43m   10.129.0.61   control-plane-1.example.com

Check that the profile is correct. Examine the logs of the linuxptp daemon that corresponds to the node you specified in the PtpConfig profile. Run the following command:

$ oc logs linuxptp-daemon-4xkbb -n openshift-ptp -c linuxptp-daemon-container

Example output
16.5.5. Configuring linuxptp services as boundary clocks for dual NIC hardware
Precision Time Protocol (PTP) hardware with dual NIC configured as boundary clocks is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
You can configure the linuxptp services (ptp4l, phc2sys) as boundary clocks for dual NIC hardware by creating a PtpConfig custom resource (CR) object for each NIC.
Dual NIC hardware allows you to connect each NIC to the same upstream leader clock with separate ptp4l instances for each NIC feeding the downstream clocks.
Prerequisites
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
- Install the PTP Operator.
Procedure
Create two separate PtpConfig CRs, one for each NIC, using the reference CR in "Configuring linuxptp services as a boundary clock" as the basis for each CR. For example:

Create boundary-clock-ptp-config-nic1.yaml, specifying values for phc2sysOpts:

1. Specify the required interfaces to start ptp4l as a boundary clock. For example, ens5f0 synchronizes from a grandmaster clock and ens5f1 synchronizes connected devices.
2. Required phc2sysOpts values. -m prints messages to stdout. The linuxptp-daemon DaemonSet parses the logs and generates Prometheus metrics.

Create boundary-clock-ptp-config-nic2.yaml, removing the phc2sysOpts field altogether to disable the phc2sys service for the second NIC:

1. Specify the required interfaces to start ptp4l as a boundary clock on the second NIC.

Note

You must completely remove the phc2sysOpts field from the second PtpConfig CR to disable the phc2sys service on the second NIC.
Create the dual NIC PtpConfig CRs by running the following commands:

Create the CR that configures PTP for the first NIC:

$ oc create -f boundary-clock-ptp-config-nic1.yaml

Create the CR that configures PTP for the second NIC:

$ oc create -f boundary-clock-ptp-config-nic2.yaml
Verification
Check that the PTP Operator has applied the PtpConfig CRs for both NICs. Examine the logs for the linuxptp daemon corresponding to the node that has the dual NIC hardware installed. For example, run the following command:

$ oc logs linuxptp-daemon-cvgr6 -n openshift-ptp -c linuxptp-daemon-container

Example output

ptp4l[80828.335]: [ptp4l.1.config] master offset          5 s2 freq   -5727 path delay       519
ptp4l[80828.343]: [ptp4l.0.config] master offset         -5 s2 freq  -10607 path delay       533
phc2sys[80828.390]: [ptp4l.0.config] CLOCK_REALTIME phc offset         1 s2 freq  -87239 delay    539
16.5.6. Intel Columbiaville E800 series NIC as PTP ordinary clock reference
The following table describes the changes that you must make to the reference PTP configuration in order to use Intel Columbiaville E800 series NICs as ordinary clocks. Make the changes in a PtpConfig custom resource (CR) that you apply to the cluster.
| PTP configuration | Recommended setting |
|---|---|
| phc2sysOpts | -a -r -m -n 24 -N 8 -R 16 |
| tx_timestamp_timeout | 50 |
| boundary_clock_jbod | 0 |
For phc2sysOpts, -m prints messages to stdout. The linuxptp-daemon DaemonSet parses the logs and generates Prometheus metrics.
16.5.7. Configuring FIFO priority scheduling for PTP hardware
In telco or other deployment configurations that require low latency performance, PTP daemon threads run in a constrained CPU footprint alongside the rest of the infrastructure components. By default, PTP threads run with the SCHED_OTHER policy. Under high load, these threads might not get the scheduling latency they require for error-free operation.
To mitigate against potential scheduling latency errors, you can configure the PTP Operator linuxptp services to allow threads to run with a SCHED_FIFO policy. If SCHED_FIFO is set for a PtpConfig CR, then ptp4l and phc2sys will run in the parent container under chrt with a priority set by the ptpSchedulingPriority field of the PtpConfig CR.
Setting ptpSchedulingPolicy is optional, and is only required if you are experiencing latency errors.
Procedure
Edit the PtpConfig CR profile:

$ oc edit PtpConfig -n openshift-ptp

Change the ptpSchedulingPolicy and ptpSchedulingPriority fields:
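A minimal sketch of the two fields in a profile, using the priority value 65 that appears in the verification output below:

spec:
  profile:
  - name: ordinary-clock            # your existing profile name
    # other profile fields unchanged
    ptpSchedulingPolicy: SCHED_FIFO
    ptpSchedulingPriority: 65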
Save and exit to apply the changes to the PtpConfig CR.
Verification
Get the name of the linuxptp-daemon pod and corresponding node where the PtpConfig CR has been applied:

$ oc get pods -n openshift-ptp -o wide

Example output

NAME                            READY   STATUS    RESTARTS   AGE     IP            NODE
linuxptp-daemon-gmv2n           3/3     Running   0          1d17h   10.1.196.24   compute-0.example.com
linuxptp-daemon-lgm55           3/3     Running   0          1d17h   10.1.196.25   compute-1.example.com
ptp-operator-3r4dcvf7f4-zndk7   1/1     Running   0          1d7h    10.129.0.61   control-plane-1.example.com

Check that the ptp4l process is running with the updated chrt FIFO priority:

$ oc -n openshift-ptp logs linuxptp-daemon-lgm55 -c linuxptp-daemon-container | grep chrt

Example output

I1216 19:24:57.091872 1600715 daemon.go:285] /bin/chrt -f 65 /usr/sbin/ptp4l -f /var/run/ptp4l.0.config -2 --summary_interval -4 -m
16.5.8. Configuring log filtering for linuxptp services
The linuxptp daemon generates logs that you can use for debugging purposes. In telco or other deployment configurations that feature a limited storage capacity, these logs can add to the storage demand.
To reduce the number of log messages, you can configure the PtpConfig custom resource (CR) to exclude log messages that report the master offset value. The master offset log message reports the difference between the current node’s clock and the master clock in nanoseconds.
Prerequisites
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
- Install the PTP Operator.
Procedure
Edit the PtpConfig CR:

$ oc edit PtpConfig -n openshift-ptp

In spec.profile, add the ptpSettings.logReduce specification and set the value to true:

Note

For debugging purposes, you can revert this specification to False to include the master offset messages.

Save and exit to apply the changes to the PtpConfig CR.

Verification

Get the name of the linuxptp-daemon pod and corresponding node where the PtpConfig CR has been applied:

$ oc get pods -n openshift-ptp -o wide

Example output

NAME                            READY   STATUS    RESTARTS   AGE     IP            NODE
linuxptp-daemon-gmv2n           3/3     Running   0          1d17h   10.1.196.24   compute-0.example.com
linuxptp-daemon-lgm55           3/3     Running   0          1d17h   10.1.196.25   compute-1.example.com
ptp-operator-3r4dcvf7f4-zndk7   1/1     Running   0          1d7h    10.129.0.61   control-plane-1.example.com

Verify that master offset messages are excluded from the logs by running the following command:

$ oc -n openshift-ptp logs <linux_daemon_container> -c linuxptp-daemon-container | grep "master offset"

1. <linux_daemon_container> is the name of the linuxptp-daemon pod, for example linuxptp-daemon-gmv2n.

When you configure the logReduce specification, this command does not report any instances of master offset in the logs of the linuxptp daemon.
16.6. Troubleshooting common PTP Operator issues
Troubleshoot common problems with the PTP Operator by performing the following steps.
Prerequisites
- Install the OpenShift Container Platform CLI (oc).
- Log in as a user with cluster-admin privileges.
- Install the PTP Operator on a bare-metal cluster with hosts that support PTP.
Procedure
Check that the Operator and operands are successfully deployed in the cluster for the configured nodes.

$ oc get pods -n openshift-ptp -o wide

Example output

NAME                            READY   STATUS    RESTARTS   AGE     IP            NODE
linuxptp-daemon-lmvgn           3/3     Running   0          4d17h   10.1.196.24   compute-0.example.com
linuxptp-daemon-qhfg7           3/3     Running   0          4d17h   10.1.196.25   compute-1.example.com
ptp-operator-6b8dcbf7f4-zndk7   1/1     Running   0          5d7h    10.129.0.61   control-plane-1.example.com

Note

When the PTP fast event bus is enabled, the number of ready linuxptp-daemon pods is 3/3. If the PTP fast event bus is not enabled, 2/2 is displayed.

Check that supported hardware is found in the cluster.
$ oc -n openshift-ptp get nodeptpdevices.ptp.openshift.io

Example output

Check the available PTP network interfaces for a node:
$ oc -n openshift-ptp get nodeptpdevices.ptp.openshift.io <node_name> -o yaml

where:

- <node_name>: Specifies the node you want to query, for example, compute-0.example.com.

Example output
Check that the PTP interface is successfully synchronized to the primary clock by accessing the linuxptp-daemon pod for the corresponding node.

Get the name of the linuxptp-daemon pod and corresponding node you want to troubleshoot by running the following command:

$ oc get pods -n openshift-ptp -o wide

Example output

NAME                            READY   STATUS    RESTARTS   AGE     IP            NODE
linuxptp-daemon-lmvgn           3/3     Running   0          4d17h   10.1.196.24   compute-0.example.com
linuxptp-daemon-qhfg7           3/3     Running   0          4d17h   10.1.196.25   compute-1.example.com
ptp-operator-6b8dcbf7f4-zndk7   1/1     Running   0          5d7h    10.129.0.61   control-plane-1.example.com

Remote shell into the required linuxptp-daemon container:

$ oc rsh -n openshift-ptp -c linuxptp-daemon-container <linux_daemon_container>

where:

- <linux_daemon_container>: is the container you want to diagnose, for example linuxptp-daemon-lmvgn.

In the remote shell connection to the linuxptp-daemon container, use the PTP Management Client (pmc) tool to diagnose the network interface. Run the following pmc command to check the sync status of the PTP device, for example ptp4l.

# pmc -u -f /var/run/ptp4l.0.config -b 0 'GET PORT_DATA_SET'

Example output when the node is successfully synced to the primary clock
16.6.1. Collecting Precision Time Protocol (PTP) Operator data
You can use the oc adm must-gather CLI command to collect information about your cluster, including features and objects associated with Precision Time Protocol (PTP) Operator.
Prerequisites
- You have access to the cluster as a user with the cluster-admin role.
- You have installed the OpenShift CLI (oc).
- You have installed the PTP Operator.
Procedure
To collect PTP Operator data with must-gather, you must specify the PTP Operator must-gather image.

$ oc adm must-gather --image=registry.redhat.io/openshift4/ptp-must-gather-rhel8:v4.12
16.7. PTP hardware fast event notifications framework
Cloud native applications such as virtual RAN (vRAN) require access to notifications about hardware timing events that are critical to the functioning of the overall network. PTP clock synchronization errors can negatively affect the performance and reliability of your low-latency application, for example, a vRAN application running in a distributed unit (DU).
16.7.1. About PTP and clock synchronization error events
Loss of PTP synchronization is a critical error for a RAN network. If synchronization is lost on a node, the radio might be shut down and the network Over the Air (OTA) traffic might be shifted to another node in the wireless network. Fast event notifications mitigate against workload errors by allowing cluster nodes to communicate PTP clock sync status to the vRAN application running in the DU.
Event notifications are available to vRAN applications running on the same DU node. A publish-subscribe REST API passes events notifications to the messaging bus. Publish-subscribe messaging, or pub-sub messaging, is an asynchronous service-to-service communication architecture where any message published to a topic is immediately received by all of the subscribers to the topic.
The PTP Operator generates fast event notifications for every PTP-capable network interface. You can access the events by using a cloud-event-proxy sidecar container over an HTTP or Advanced Message Queuing Protocol (AMQP) message bus.
PTP fast event notifications are available for network interfaces configured to use PTP ordinary clocks or PTP boundary clocks.
HTTP transport is the default transport for PTP and bare-metal events. Use HTTP transport instead of AMQP for PTP and bare-metal events where possible. AMQ Interconnect is EOL from 30 June 2024. Extended life cycle support (ELS) for AMQ Interconnect ends 29 November 2029. For more information, see Red Hat AMQ Interconnect support status.
16.7.2. About the PTP fast event notifications framework
Use the Precision Time Protocol (PTP) fast event notifications framework to subscribe cluster applications to PTP events that the bare-metal cluster node generates.
The fast events notifications framework uses a REST API for communication. The REST API is based on the O-RAN O-Cloud Notification API Specification for Event Consumers 3.0 that is available from O-RAN ALLIANCE Specifications.
The framework consists of a publisher, subscriber, and an AMQ or HTTP messaging protocol to handle communications between the publisher and subscriber applications. Applications run the cloud-event-proxy container in a sidecar pattern to subscribe to PTP events. The cloud-event-proxy sidecar container can access the same resources as the primary application container without using any of the resources of the primary application and with no significant latency.
HTTP transport is the default transport for PTP and bare-metal events. Use HTTP transport instead of AMQP for PTP and bare-metal events where possible. AMQ Interconnect is EOL from 30 June 2024. Extended life cycle support (ELS) for AMQ Interconnect ends 29 November 2029. For more information, see Red Hat AMQ Interconnect support status.
Figure 16.1. Overview of PTP fast events
- Event is generated on the cluster host
  linuxptp-daemon in the PTP Operator-managed pod runs as a Kubernetes DaemonSet and manages the various linuxptp processes (ptp4l, phc2sys, and optionally for grandmaster clocks, ts2phc). The linuxptp-daemon passes the event to the UNIX domain socket.
- Event is passed to the cloud-event-proxy sidecar
  The PTP plugin reads the event from the UNIX domain socket and passes it to the cloud-event-proxy sidecar in the PTP Operator-managed pod. cloud-event-proxy delivers the event from the Kubernetes infrastructure to Cloud-Native Network Functions (CNFs) with low latency.
- Event is persisted
  The cloud-event-proxy sidecar in the PTP Operator-managed pod processes the event and publishes the cloud-native event by using a REST API.
- Message is transported
  The message transporter transports the event to the cloud-event-proxy sidecar in the application pod over HTTP or AMQP 1.0 QPID.
- Event is available from the REST API
  The cloud-event-proxy sidecar in the application pod processes the event and makes it available by using the REST API.
- Consumer application requests a subscription and receives the subscribed event
  The consumer application sends an API request to the cloud-event-proxy sidecar in the application pod to create a PTP events subscription. The cloud-event-proxy sidecar creates an AMQ or HTTP messaging listener protocol for the resource specified in the subscription.

The cloud-event-proxy sidecar in the application pod receives the event from the PTP Operator-managed pod, unwraps the cloud events object to retrieve the data, and posts the event to the consumer application. The consumer application listens to the address specified in the resource qualifier and receives and processes the PTP event.
16.7.3. Configuring the PTP fast event notifications publisher Copy linkLink copied to clipboard!
To start using PTP fast event notifications for a network interface in your cluster, you must enable the fast event publisher in the PTP Operator PtpOperatorConfig custom resource (CR) and configure ptpClockThreshold values in a PtpConfig CR that you create.
Prerequisites
- You have installed the OpenShift Container Platform CLI (oc).
- You have logged in as a user with cluster-admin privileges.
- You have installed the PTP Operator.
Procedure
Modify the default PTP Operator config to enable PTP fast events.
Save the following YAML in the ptp-operatorconfig.yaml file. Set enableEventPublisher to true to enable PTP fast event notifications.
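The original manifest is not reproduced in this extraction. The following is a minimal sketch of the PtpOperatorConfig CR, assuming the default resource in the openshift-ptp namespace and a worker node selector:

apiVersion: ptp.openshift.io/v1
kind: PtpOperatorConfig
metadata:
  name: default
  namespace: openshift-ptp
spec:
  daemonNodeSelector:
    node-role.kubernetes.io/worker: ""
  ptpEventConfig:
    enableEventPublisher: true   # enables PTP fast event notifications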
Note: In OpenShift Container Platform 4.12 or later, you do not need to set the spec.ptpEventConfig.transportHost field in the PtpOperatorConfig resource when you use HTTP transport for PTP events. Set transportHost only when you use AMQP transport for PTP events.
Update the PtpOperatorConfig CR:
$ oc apply -f ptp-operatorconfig.yaml
Create a PtpConfig custom resource (CR) for the PTP enabled interface, and set the required values for ptpClockThreshold and ptp4lOpts. The following YAML illustrates the required values that you must set in the PtpConfig CR:
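The original manifest is not reproduced in this extraction. The following is a minimal sketch of the relevant PtpConfig profile fields, assuming an example profile name and interface; the numbered comments map to the notes that follow, and the recommend section and node selectors are omitted:

apiVersion: ptp.openshift.io/v1
kind: PtpConfig
metadata:
  name: example-ptpconfig
  namespace: openshift-ptp
spec:
  profile:
  - name: "profile1"
    interface: "ens787f1"
    ptp4lOpts: "-2 -s --summary_interval -4"   # 1
    phc2sysOpts: "-a -r -m"                    # 2
    ptp4lConf: ""                              # 3
    ptpClockThreshold:                         # 4
      holdOverTimeout: 5
      maxOffsetThreshold: 100
      minOffsetThreshold: -100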
1. Append --summary_interval -4 to use PTP fast events.
2. Required phc2sysOpts values. -m prints messages to stdout. The linuxptp-daemon DaemonSet parses the logs and generates Prometheus metrics.
3. Specify a string that contains the configuration to replace the default /etc/ptp4l.conf file. To use the default configuration, leave the field empty.
4. Optional. If the ptpClockThreshold stanza is not present, default values are used for the ptpClockThreshold fields. The stanza shows default ptpClockThreshold values. The ptpClockThreshold values configure how long after the PTP master clock is disconnected before PTP events are triggered. holdOverTimeout is the time value in seconds before the PTP clock event state changes to FREERUN when the PTP master clock is disconnected. The maxOffsetThreshold and minOffsetThreshold settings configure offset values in nanoseconds that compare against the values for CLOCK_REALTIME (phc2sys) or master offset (ptp4l). When the ptp4l or phc2sys offset value is outside this range, the PTP clock state is set to FREERUN. When the offset value is within this range, the PTP clock state is set to LOCKED.
16.7.4. Migrating consumer applications to use HTTP transport for PTP or bare-metal events Copy linkLink copied to clipboard!
If you have previously deployed PTP or bare-metal events consumer applications, you need to update the applications to use HTTP message transport.
Prerequisites
- You have installed the OpenShift CLI (oc).
- You have logged in as a user with cluster-admin privileges.
- You have updated the PTP Operator or Bare Metal Event Relay to version 4.12 or later, which uses HTTP transport by default.
Procedure
Update your events consumer application to use HTTP transport. Set the http-event-publishers variable for the cloud event sidecar deployment.
For example, in a cluster with PTP events configured, the following YAML snippet illustrates a cloud event sidecar deployment:
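The original snippet is not reproduced in this extraction. A minimal sketch of the relevant container arguments, assuming the consumer runs in a cloud-events namespace and receives events through a consumer-events-subscription-service; the numbered comment maps to the note that follows:

containers:
- name: cloud-event-sidecar
  image: cloud-event-sidecar
  args:
  - "--metrics-addr=127.0.0.1:9091"
  - "--store-path=/store"
  - "--transport-host=consumer-events-subscription-service.cloud-events.svc.cluster.local:9043"
  - "--http-event-publishers=ptp-event-publisher-service-NODE_NAME.openshift-ptp.svc.cluster.local:9043"   # 1
  - "--api-port=8089"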
1. The PTP Operator automatically resolves NODE_NAME to the host that is generating the PTP events. For example, compute-1.example.com.
In a cluster with bare-metal events configured, set the http-event-publishers field to hw-event-publisher-service.openshift-bare-metal-events.svc.cluster.local:9043 in the cloud event sidecar deployment CR.
Deploy the consumer-events-subscription-service service alongside the events consumer application. For example:
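The original manifest is not reproduced in this extraction. A minimal sketch of the service, assuming the consumer application pods carry an app: consumer label in a cloud-events namespace:

apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: "true"
  name: consumer-events-subscription-service
  namespace: cloud-events
  labels:
    app: consumer-service
spec:
  ports:
  - name: sub-port
    port: 9043
  selector:
    app: consumer
  clusterIP: None
  sessionAffinity: None
  type: ClusterIP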
16.7.5. Installing the AMQ messaging bus Copy linkLink copied to clipboard!
To pass PTP fast event notifications between publisher and subscriber on a node, you can install and configure an AMQ messaging bus to run locally on the node. To use AMQ messaging, you must install the AMQ Interconnect Operator.
HTTP transport is the default transport for PTP and bare-metal events. Use HTTP transport instead of AMQP for PTP and bare-metal events where possible. AMQ Interconnect is EOL from 30 June 2024. Extended life cycle support (ELS) for AMQ Interconnect ends 29 November 2029. For more information, see Red Hat AMQ Interconnect support status.
Prerequisites
- Install the OpenShift Container Platform CLI (oc).
- Log in as a user with cluster-admin privileges.
Procedure
- Install the AMQ Interconnect Operator to its own amq-interconnect namespace. See Adding the Red Hat Integration - AMQ Interconnect Operator.
Verification
Check that the AMQ Interconnect Operator is available and the required pods are running:
$ oc get pods -n amq-interconnect
Example output
NAME                                    READY   STATUS    RESTARTS   AGE
amq-interconnect-645db76c76-k8ghs       1/1     Running   0          23h
interconnect-operator-5cb5fc7cc-4v7qm   1/1     Running   0          23h
Check that the required linuxptp-daemon PTP event producer pods are running in the openshift-ptp namespace:
$ oc get pods -n openshift-ptp
Example output
NAME                    READY   STATUS    RESTARTS   AGE
linuxptp-daemon-2t78p   3/3     Running   0          12h
linuxptp-daemon-k8n88   3/3     Running   0          12h
16.7.6. Subscribing DU applications to PTP events REST API reference Copy linkLink copied to clipboard!
Use the PTP event notifications REST API to subscribe a distributed unit (DU) application to the PTP events that are generated on the parent node.
Subscribe applications to PTP events by using the resource address /cluster/node/<node_name>/ptp, where <node_name> is the cluster node running the DU application.
Deploy your cloud-event-consumer DU application container and cloud-event-proxy sidecar container in a separate DU application pod. The cloud-event-consumer DU application subscribes to the cloud-event-proxy container in the application pod.
Use the following API endpoints to subscribe the cloud-event-consumer DU application to PTP events posted by the cloud-event-proxy container at http://localhost:8089/api/ocloudNotifications/v1/ in the DU application pod:
- /api/ocloudNotifications/v1/subscriptions
  - POST: Creates a new subscription
  - GET: Retrieves a list of subscriptions
  - DELETE: Deletes all subscriptions
- /api/ocloudNotifications/v1/subscriptions/<subscription_id>
  - GET: Returns details for the specified subscription ID
  - DELETE: Deletes the subscription associated with the specified subscription ID
- /api/ocloudNotifications/v1/health
  - GET: Returns the health status of the ocloudNotifications API
- /api/ocloudNotifications/v1/publishers
  - GET: Returns an array of os-clock-sync-state, ptp-clock-class-change, and lock-state messages for the cluster node
- /api/ocloudNotifications/v1/{resource_address}/CurrentState
  - GET: Returns the current state of one of the following event types: os-clock-sync-state, ptp-clock-class-change, or lock-state events
9089 is the default port for the cloud-event-consumer container deployed in the application pod. You can configure a different port for your DU application as required.
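As an illustration only (not part of the original document), a DU application could create a subscription with a request like the following. The payload mirrors the example payload shown later in this section, and the node name is an assumption:

$ curl -X POST http://localhost:8089/api/ocloudNotifications/v1/subscriptions \
  -H "Content-Type: application/json" \
  -d '{"uriLocation": "http://localhost:8089/api/ocloudNotifications/v1/subscriptions", "resource": "/cluster/node/compute-1.example.com/ptp"}'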
16.7.6.1. api/ocloudNotifications/v1/subscriptions Copy linkLink copied to clipboard!
HTTP method
GET api/ocloudNotifications/v1/subscriptions
Description
Returns a list of subscriptions. If subscriptions exist, a 200 OK status code is returned along with the list of subscriptions.
Example API response
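The original response body is not reproduced in this extraction. As an illustration only, a successful response is a JSON array of subscription objects similar to the following; the ID and endpoint values are assumptions:

[
  {
    "id": "75b1ad8f-c807-4c23-acf5-56f4b7ee3826",
    "endpointUri": "http://localhost:9089/event",
    "uriLocation": "http://localhost:8089/api/ocloudNotifications/v1/subscriptions/75b1ad8f-c807-4c23-acf5-56f4b7ee3826",
    "resource": "/cluster/node/compute-1.example.com/ptp"
  }
]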
HTTP method
POST api/ocloudNotifications/v1/subscriptions
Description
Creates a new subscription. If a subscription is successfully created, or if it already exists, a 201 Created status code is returned.
| Parameter | Type |
|---|---|
| subscription | data |
Example payload
{
"uriLocation": "http://localhost:8089/api/ocloudNotifications/v1/subscriptions",
"resource": "/cluster/node/compute-1.example.com/ptp"
}
HTTP method
DELETE api/ocloudNotifications/v1/subscriptions
Description
Deletes all subscriptions.
Example API response
{
"status": "deleted all subscriptions"
}
16.7.6.2. api/ocloudNotifications/v1/subscriptions/{subscription_id} Copy linkLink copied to clipboard!
HTTP method
GET api/ocloudNotifications/v1/subscriptions/{subscription_id}
Description
Returns details for the subscription with ID subscription_id.
| Parameter | Type |
|---|---|
| subscription_id | string |
Example API response
HTTP method
DELETE api/ocloudNotifications/v1/subscriptions/{subscription_id}
Description
Deletes the subscription with ID subscription_id.
| Parameter | Type |
|---|---|
| subscription_id | string |
Example API response
{
"status": "OK"
}
16.7.6.3. api/ocloudNotifications/v1/health Copy linkLink copied to clipboard!
HTTP method
GET api/ocloudNotifications/v1/health/
Description
Returns the health status for the ocloudNotifications REST API.
Example API response
OK
16.7.6.4. api/ocloudNotifications/v1/publishers Copy linkLink copied to clipboard!
HTTP method
GET api/ocloudNotifications/v1/publishers
Description
Returns an array of os-clock-sync-state, ptp-clock-class-change, and lock-state details for the cluster node. The system generates notifications when the relevant equipment state changes.
- os-clock-sync-state notifications describe the host operating system clock synchronization state. Can be in LOCKED or FREERUN state.
- ptp-clock-class-change notifications describe the current state of the PTP clock class.
- lock-state notifications describe the current status of the PTP equipment lock state. Can be in LOCKED, HOLDOVER, or FREERUN state.
Example API response
You can find os-clock-sync-state, ptp-clock-class-change, and lock-state events in the logs for the cloud-event-proxy container. For example:
$ oc logs -f linuxptp-daemon-cvgr6 -n openshift-ptp -c cloud-event-proxy
Example os-clock-sync-state event
Example ptp-clock-class-change event
Example lock-state event
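The original example payloads are not reproduced in this extraction. As an illustration only, a lock-state event delivered to a consumer has roughly the following cloud-event shape; the identifiers, interface name, timestamp, and values are assumptions:

{
  "id": "5142066b-7a4d-4dcb-87c7-f0e4cf93b4ea",
  "type": "event.sync.ptp-status.ptp-state-change",
  "source": "/cluster/node/compute-1.example.com/sync/ptp-status/lock-state",
  "dataContentType": "application/json",
  "time": "2023-01-10T02:41:57.094981478Z",
  "data": {
    "version": "v1",
    "values": [
      {
        "resource": "/cluster/node/compute-1.example.com/ens5fx/master",
        "dataType": "notification",
        "valueType": "enumeration",
        "value": "LOCKED"
      },
      {
        "resource": "/cluster/node/compute-1.example.com/ens5fx/master",
        "dataType": "metric",
        "valueType": "decimal64.3",
        "value": "29"
      }
    ]
  }
}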
16.7.6.5. /api/ocloudnotifications/v1/{resource_address}/CurrentState Copy linkLink copied to clipboard!
HTTP method
GET api/ocloudNotifications/v1/cluster/node/<node_name>/sync/ptp-status/lock-state/CurrentState
GET api/ocloudNotifications/v1/cluster/node/<node_name>/sync/sync-status/os-clock-sync-state/CurrentState
GET api/ocloudNotifications/v1/cluster/node/<node_name>/sync/ptp-status/ptp-clock-class-change/CurrentState
Description
Configure the CurrentState API endpoint to return the current state of the os-clock-sync-state, ptp-clock-class-change, or lock-state events for the cluster node.
- os-clock-sync-state notifications describe the host operating system clock synchronization state. Can be in LOCKED or FREERUN state.
- ptp-clock-class-change notifications describe the current state of the PTP clock class.
- lock-state notifications describe the current status of the PTP equipment lock state. Can be in LOCKED, HOLDOVER, or FREERUN state.
| Parameter | Type |
|---|---|
| resource_address | string |
Example lock-state API response
Example os-clock-sync-state API response
Example ptp-clock-class-change API response
16.7.7. Monitoring PTP fast event metrics Copy linkLink copied to clipboard!
You can monitor PTP fast events metrics from cluster nodes where the linuxptp-daemon is running. You can also monitor PTP fast event metrics in the OpenShift Container Platform web console by using the preconfigured and self-updating Prometheus monitoring stack.
Prerequisites
- Install the OpenShift Container Platform CLI (oc).
- Log in as a user with cluster-admin privileges.
- Install and configure the PTP Operator on a node with PTP-capable hardware.
Procedure
Check for exposed PTP metrics on any node where the linuxptp-daemon is running. For example, run the following command:
$ curl http://<node_name>:9091/metrics
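The original example output is not reproduced in this extraction. As an illustration only, the response contains Prometheus-format openshift_ptp_* metrics similar to the following; the interface and node names and the values are assumptions:

# TYPE openshift_ptp_offset_ns gauge
openshift_ptp_offset_ns{from="master",iface="ens1fx",node="compute-1.example.com",process="ptp4l"} -9
openshift_ptp_offset_ns{from="phc",iface="CLOCK_REALTIME",node="compute-1.example.com",process="phc2sys"} 31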
- To view the PTP event in the OpenShift Container Platform web console, copy the name of the PTP metric you want to query, for example, openshift_ptp_offset_ns.
- In the OpenShift Container Platform web console, click Observe → Metrics.
- Paste the PTP metric name into the Expression field, and click Run queries.
Chapter 17. External DNS Operator Copy linkLink copied to clipboard!
17.1. External DNS Operator in OpenShift Container Platform Copy linkLink copied to clipboard!
The External DNS Operator deploys and manages ExternalDNS to provide name resolution for services and routes from the external DNS provider to OpenShift Container Platform.
17.1.1. External DNS Operator Copy linkLink copied to clipboard!
The External DNS Operator implements the External DNS API from the olm.openshift.io API group. The External DNS Operator updates services, routes, and external DNS providers.
Prerequisites
- You have installed the yq CLI tool.
Procedure
You can deploy the External DNS Operator on demand from the OperatorHub. Deploying the External DNS Operator creates a Subscription object.
Check the name of an install plan by running the following command:
$ oc -n external-dns-operator get sub external-dns-operator -o yaml | yq '.status.installplan.name'
Example output
install-zcvlr
Check if the status of an install plan is Complete by running the following command:
$ oc -n external-dns-operator get ip <install_plan_name> -o yaml | yq '.status.phase'
Example output
Complete
View the status of the external-dns-operator deployment by running the following command:
$ oc get -n external-dns-operator deployment/external-dns-operator
Example output
NAME                    READY   UP-TO-DATE   AVAILABLE   AGE
external-dns-operator   1/1     1            1           23h
17.1.2. External DNS Operator logs Copy linkLink copied to clipboard!
You can view External DNS Operator logs by using the oc logs command.
Procedure
View the logs of the External DNS Operator by running the following command:
$ oc logs -n external-dns-operator deployment/external-dns-operator -c external-dns-operator
17.1.2.1. External DNS Operator domain name limitations Copy linkLink copied to clipboard!
The External DNS Operator uses the TXT registry, which adds a prefix to TXT records. This reduces the maximum length of the domain name for TXT records. A DNS record cannot be present without a corresponding TXT record, so the domain name of the DNS record must follow the same limit as the TXT records. For example, a DNS record of <domain_name_from_source> results in a TXT record of external-dns-<record_type>-<domain_name_from_source>.
The domain name of the DNS records generated by the External DNS Operator has the following limitations:
| Record type | Number of characters |
|---|---|
| CNAME | 44 |
| Wildcard CNAME records on AzureDNS | 42 |
| A | 48 |
| Wildcard A records on AzureDNS | 46 |
The following error appears in the External DNS Operator logs if the generated domain name exceeds any of the domain name limitations:
time="2022-09-02T08:53:57Z" level=error msg="Failure in zone test.example.io. [Id: /hostedzone/Z06988883Q0H0RL6UMXXX]" time="2022-09-02T08:53:57Z" level=error msg="InvalidChangeBatch: [FATAL problem: DomainLabelTooLong (Domain label is too long) encountered with 'external-dns-a-hello-openshift-aaaaaaaaaa-bbbbbbbbbb-ccccccc']\n\tstatus code: 400, request id: e54dfd5a-06c6-47b0-bcb9-a4f7c3a4e0c6"
time="2022-09-02T08:53:57Z" level=error msg="Failure in zone test.example.io. [Id: /hostedzone/Z06988883Q0H0RL6UMXXX]"
time="2022-09-02T08:53:57Z" level=error msg="InvalidChangeBatch: [FATAL problem: DomainLabelTooLong (Domain label is too long) encountered with 'external-dns-a-hello-openshift-aaaaaaaaaa-bbbbbbbbbb-ccccccc']\n\tstatus code: 400, request id: e54dfd5a-06c6-47b0-bcb9-a4f7c3a4e0c6"
17.2. Installing External DNS Operator on cloud providers Copy linkLink copied to clipboard!
You can install the External DNS Operator on cloud providers such as AWS, Azure, and Google Cloud.
17.2.1. Installing the External DNS Operator with OperatorHub Copy linkLink copied to clipboard!
You can install the External DNS Operator by using the OpenShift Container Platform OperatorHub.
Procedure
- Click Operators → OperatorHub in the OpenShift Container Platform web console.
- Click External DNS Operator. You can use the Filter by keyword text box or the filter list to search for External DNS Operator from the list of Operators.
- Select the external-dns-operator namespace.
- On the External DNS Operator page, click Install.
- On the Install Operator page, ensure that you selected the following options:
  - Update channel as stable-v1.
  - Installation mode as A specific namespace on the cluster.
  - Installed namespace as external-dns-operator. If the external-dns-operator namespace does not exist, it is created during the Operator installation.
  - Approval Strategy as Automatic or Manual. Approval Strategy is set to Automatic by default.
- Click Install.
If you select Automatic updates, the Operator Lifecycle Manager (OLM) automatically upgrades the running instance of your Operator without any intervention.
If you select Manual updates, the OLM creates an update request. As a cluster administrator, you must then manually approve that update request to have the Operator updated to the new version.
Verification
Verify that the External DNS Operator shows the Status as Succeeded on the Installed Operators dashboard.
17.2.2. Installing the External DNS Operator by using the CLI Copy linkLink copied to clipboard!
You can install the External DNS Operator by using the CLI.
Prerequisites
- You are logged in to the OpenShift Container Platform web console as a user with cluster-admin permissions.
- You are logged in to the OpenShift CLI (oc).
Procedure
Create a Namespace object:
Create a YAML file that defines the Namespace object:
Example namespace.yaml file
apiVersion: v1
kind: Namespace
metadata:
  name: external-dns-operator
Create the Namespace object by running the following command:
$ oc apply -f namespace.yaml
Create an OperatorGroup object:
Create a YAML file that defines the OperatorGroup object:
Example operatorgroup.yaml file
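The original example is not reproduced in this extraction. A minimal sketch, assuming the Operator is scoped to the external-dns-operator namespace created in the previous step:

apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: external-dns-operator
  namespace: external-dns-operator
spec:
  targetNamespaces:
  - external-dns-operator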
Create the OperatorGroup object by running the following command:
$ oc apply -f operatorgroup.yaml
Create a Subscription object:
Create a YAML file that defines the Subscription object:
Example subscription.yaml file
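The original example is not reproduced in this extraction. A minimal sketch, assuming the stable-v1 channel and the redhat-operators catalog source named elsewhere in this chapter:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: external-dns-operator
  namespace: external-dns-operator
spec:
  channel: stable-v1
  installPlanApproval: Automatic
  name: external-dns-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace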
Create the Subscription object by running the following command:
$ oc apply -f subscription.yaml
Verification
Get the name of the install plan from the subscription by running the following command:
$ oc -n external-dns-operator get subscription external-dns-operator --template='{{.status.installplan.name}}{{"\n"}}'
Verify that the status of the install plan is Complete by running the following command:
$ oc -n external-dns-operator get ip <install_plan_name> --template='{{.status.phase}}{{"\n"}}'
Verify that the status of the external-dns-operator pod is Running by running the following command:
$ oc -n external-dns-operator get pod
Example output
NAME                                     READY   STATUS    RESTARTS   AGE
external-dns-operator-5584585fd7-5lwqm   2/2     Running   0          11m
Verify that the catalog source of the subscription is redhat-operators by running the following command:
$ oc -n external-dns-operator get subscription
Example output
NAME                    PACKAGE                 SOURCE             CHANNEL
external-dns-operator   external-dns-operator   redhat-operators   stable-v1
Check the external-dns-operator version by running the following command:
$ oc -n external-dns-operator get csv
Example output
NAME                             DISPLAY                VERSION   REPLACES   PHASE
external-dns-operator.v<1.y.z>   ExternalDNS Operator   <1.y.z>              Succeeded
17.3. External DNS Operator configuration parameters Copy linkLink copied to clipboard!
The External DNS Operator includes the following configuration parameters.
17.3.1. External DNS Operator configuration parameters Copy linkLink copied to clipboard!
The External DNS Operator includes the following configuration parameters:
| Parameter | Description |
|---|---|
| spec.provider | Enables the type of a cloud provider. |
| spec.zones | Enables you to specify DNS zones by their domains. If you do not specify zones, the ExternalDNS resource discovers all of the zones present in your cloud provider account. For example: zones: - "myzoneid" |
| spec.domains | Enables you to specify AWS zones by their domains. If you do not specify domains, the ExternalDNS resource uses all of the zones present in your cloud provider account. |
| spec.source | Enables you to specify the source for the DNS records, such as Service or OpenShiftRoute. |
17.4. Creating DNS records on AWS Copy linkLink copied to clipboard!
You can create DNS records on AWS and AWS GovCloud by using the External DNS Operator.
17.4.1. Creating DNS records on a public hosted zone for AWS by using the Red Hat External DNS Operator
You can create DNS records on a public hosted zone for AWS by using the Red Hat External DNS Operator. You can use the same instructions to create DNS records on a hosted zone for AWS GovCloud.
Procedure
Check the user. The user must have access to the kube-system namespace. If you do not have the credentials, you can fetch the credentials from the kube-system namespace to use the cloud provider client:
$ oc whoami
Example output
system:admin
Fetch the values from the aws-creds secret present in the kube-system namespace:
$ export AWS_ACCESS_KEY_ID=$(oc get secrets aws-creds -n kube-system --template={{.data.aws_access_key_id}} | base64 -d)
$ export AWS_SECRET_ACCESS_KEY=$(oc get secrets aws-creds -n kube-system --template={{.data.aws_secret_access_key}} | base64 -d)
Get the routes to check the domain:
$ oc get routes --all-namespaces | grep console
Example output
openshift-console console console-openshift-console.apps.testextdnsoperator.apacshift.support console https reencrypt/Redirect None
openshift-console downloads downloads-openshift-console.apps.testextdnsoperator.apacshift.support downloads http edge/Redirect None
Get the list of DNS zones to find the one that corresponds to the previously found route's domain:
$ aws route53 list-hosted-zones | grep testextdnsoperator.apacshift.support
Example output
HOSTEDZONES terraform /hostedzone/Z02355203TNN1XXXX1J6O testextdnsoperator.apacshift.support. 5
Create an ExternalDNS resource for the route source:
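The original example manifest is not reproduced in this extraction. A minimal sketch, assuming the externaldns.olm.openshift.io/v1beta1 API and the zone domain shown in the previous step; the numbered comments map to the notes that follow:

apiVersion: externaldns.olm.openshift.io/v1beta1
kind: ExternalDNS
metadata:
  name: sample-aws                               # 1
spec:
  domains:
  - filterType: Include                          # 2
    matchType: Exact                             # 3
    name: testextdnsoperator.apacshift.support   # 4
  provider:
    type: AWS                                    # 5
  source:                                        # 6
    type: OpenShiftRoute                         # 7
    openshiftRouteOptions:
      routerName: default                        # 8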
1. Defines the name of the external DNS resource.
2. By default all hosted zones are selected as potential targets. You can include a hosted zone that you need.
3. The matching of the target zone's domain has to be exact (as opposed to regular expression match).
4. Specify the exact domain of the zone you want to update. The hostname of the routes must be subdomains of the specified domain.
5. Defines the AWS Route53 DNS provider.
6. Defines options for the source of DNS records.
7. Defines the OpenShift route resource as the source for the DNS records which get created in the previously specified DNS provider.
8. If the source is OpenShiftRoute, then you can pass the OpenShift Ingress Controller name. External DNS Operator selects the canonical hostname of that router as the target while creating CNAME records.
Check the records created for OCP routes using the following command:
$ aws route53 list-resource-record-sets --hosted-zone-id Z02355203TNN1XXXX1J6O --query "ResourceRecordSets[?Type == 'CNAME']" | grep console
17.5. Creating DNS records on Azure Copy linkLink copied to clipboard!
You can create DNS records on Azure by using the External DNS Operator.
Using the External DNS Operator on a Microsoft Entra Workload ID-enabled cluster or a cluster that runs in Microsoft Azure Government (MAG) regions is not supported.
17.5.1. Creating DNS records on an Azure public DNS zone Copy linkLink copied to clipboard!
You can create DNS records on a public DNS zone for Azure by using the External DNS Operator.
Prerequisites
- You must have administrator privileges.
- The admin user must have access to the kube-system namespace.
Procedure
Fetch the credentials from the kube-system namespace to use the cloud provider client by running the following command:
$ CLIENT_ID=$(oc get secrets azure-credentials -n kube-system --template={{.data.azure_client_id}} | base64 -d)
$ CLIENT_SECRET=$(oc get secrets azure-credentials -n kube-system --template={{.data.azure_client_secret}} | base64 -d)
$ RESOURCE_GROUP=$(oc get secrets azure-credentials -n kube-system --template={{.data.azure_resourcegroup}} | base64 -d)
$ SUBSCRIPTION_ID=$(oc get secrets azure-credentials -n kube-system --template={{.data.azure_subscription_id}} | base64 -d)
$ TENANT_ID=$(oc get secrets azure-credentials -n kube-system --template={{.data.azure_tenant_id}} | base64 -d)
Log in to Azure by running the following command:
$ az login --service-principal -u "${CLIENT_ID}" -p "${CLIENT_SECRET}" --tenant "${TENANT_ID}"
Get a list of routes by running the following command:
$ oc get routes --all-namespaces | grep console
Example output
openshift-console console console-openshift-console.apps.test.azure.example.com console https reencrypt/Redirect None
openshift-console downloads downloads-openshift-console.apps.test.azure.example.com downloads http edge/Redirect None
Get a list of DNS zones by running the following command:
$ az network dns zone list --resource-group "${RESOURCE_GROUP}"
Create a YAML file, for example, external-dns-sample-azure.yaml, that defines the ExternalDNS object:
Example external-dns-sample-azure.yaml file
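The original example manifest is not reproduced in this extraction. A minimal sketch, assuming the externaldns.olm.openshift.io/v1beta1 API and an illustrative Azure DNS zone ID; the numbered comments map to the notes that follow:

apiVersion: externaldns.olm.openshift.io/v1beta1
kind: ExternalDNS
metadata:
  name: sample-azure                             # 1
spec:
  zones:
  - "/subscriptions/1234567890/resourceGroups/test-azure-xxxxx-rg/providers/Microsoft.Network/dnszones/test.azure.example.com"   # 2
  provider:
    type: Azure                                  # 3
  source:                                        # 4
    openshiftRouteOptions:
      routerName: default                        # 5
    type: OpenShiftRoute                         # 6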
1. Specifies the External DNS name.
2. Defines the zone ID.
3. Defines the provider type.
4. You can define options for the source of DNS records.
5. If the source type is OpenShiftRoute, you can pass the OpenShift Ingress Controller name. External DNS selects the canonical hostname of that router as the target while creating CNAME records.
6. Defines the route resource as the source for the Azure DNS records.
Check the DNS records created for OpenShift Container Platform routes by running the following command:
$ az network dns record-set list -g "${RESOURCE_GROUP}" -z test.azure.example.com | grep console
Note: To create records on private hosted zones on private Azure DNS, you need to specify the private zone under the zones field, which populates the provider type to azure-private-dns in the ExternalDNS container arguments.
17.6. Creating DNS records on Google Cloud Copy linkLink copied to clipboard!
You can create DNS records on Google Cloud by using the External DNS Operator.
Using the External DNS Operator on a cluster with Google Cloud Workload Identity enabled is not supported. For more information about the Google Cloud Workload Identity, see Using manual mode with Google Cloud Workload Identity.
17.6.1. Creating DNS records on a public managed zone for Google Cloud Copy linkLink copied to clipboard!
You can create DNS records on a public managed zone for Google Cloud by using the External DNS Operator.
Prerequisites
- You must have administrator privileges.
Procedure
Copy the gcp-credentials secret into the decoded-gcloud.json file by running the following command:
$ oc get secret gcp-credentials -n kube-system --template='{{$v := index .data "service_account.json"}}{{$v}}' | base64 -d - > decoded-gcloud.json
Export your Google credentials by running the following command:
$ export GOOGLE_CREDENTIALS=decoded-gcloud.json
Activate your account by using the following command:
$ gcloud auth activate-service-account <client_email as per decoded-gcloud.json> --key-file=decoded-gcloud.json
Set your project by running the following command:
$ gcloud config set project <project_id as per decoded-gcloud.json>
Get a list of routes by running the following command:
$ oc get routes --all-namespaces | grep console
Example output
openshift-console console console-openshift-console.apps.test.gcp.example.com console https reencrypt/Redirect None
openshift-console downloads downloads-openshift-console.apps.test.gcp.example.com downloads http edge/Redirect None
Get a list of managed zones by running the following command:
$ gcloud dns managed-zones list | grep test.gcp.example.com
Example output
qe-cvs4g-private-zone test.gcp.example.com
Create a YAML file, for example, external-dns-sample-gcp.yaml, that defines the ExternalDNS object:
Example external-dns-sample-gcp.yaml file
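The original example manifest is not reproduced in this extraction. A minimal sketch, assuming the externaldns.olm.openshift.io/v1beta1 API and the zone domain shown in the previous step; the numbered comments map to the notes that follow:

apiVersion: externaldns.olm.openshift.io/v1beta1
kind: ExternalDNS
metadata:
  name: sample-gcp                               # 1
spec:
  domains:
  - filterType: Include                          # 2
    matchType: Exact                             # 3
    name: test.gcp.example.com                   # 4
  provider:
    type: GCP                                    # 5
  source:                                        # 6
    openshiftRouteOptions:
      routerName: default                        # 7
    type: OpenShiftRoute                         # 8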
1. Specifies the External DNS name.
2. By default, all hosted zones are selected as potential targets. You can include your hosted zone.
3. The domain of the target must match the string defined by the name key.
4. Specify the exact domain of the zone you want to update. The hostname of the routes must be subdomains of the specified domain.
5. Defines the provider type.
6. You can define options for the source of DNS records.
7. If the source type is OpenShiftRoute, you can pass the OpenShift Ingress Controller name. External DNS selects the canonical hostname of that router as the target while creating CNAME records.
8. Defines the route resource as the source for Google Cloud DNS records.
Check the DNS records created for OpenShift Container Platform routes by running the following command:
$ gcloud dns record-sets list --zone=qe-cvs4g-private-zone | grep console
17.7. Creating DNS records on Infoblox Copy linkLink copied to clipboard!
You can create DNS records on Infoblox by using the External DNS Operator.
17.7.1. Creating DNS records on a public DNS zone on Infoblox Copy linkLink copied to clipboard!
You can create DNS records on a public DNS zone on Infoblox by using the External DNS Operator.
Prerequisites
- You have access to the OpenShift CLI (oc).
- You have access to the Infoblox UI.
Procedure
Create a secret object with Infoblox credentials by running the following command:
$ oc -n external-dns-operator create secret generic infoblox-credentials --from-literal=EXTERNAL_DNS_INFOBLOX_WAPI_USERNAME=<infoblox_username> --from-literal=EXTERNAL_DNS_INFOBLOX_WAPI_PASSWORD=<infoblox_password>
Get a list of routes by running the following command:
$ oc get routes --all-namespaces | grep console
Example output
openshift-console console console-openshift-console.apps.test.example.com console https reencrypt/Redirect None
openshift-console downloads downloads-openshift-console.apps.test.example.com downloads http edge/Redirect None
Create a YAML file, for example, external-dns-sample-infoblox.yaml, that defines the ExternalDNS object:
Example external-dns-sample-infoblox.yaml file
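The original example manifest is not reproduced in this extraction. A minimal sketch, assuming the externaldns.olm.openshift.io/v1beta1 API, a placeholder Infoblox grid host, and an illustrative WAPI version; the numbered comments map to the notes that follow:

apiVersion: externaldns.olm.openshift.io/v1beta1
kind: ExternalDNS
metadata:
  name: sample-infoblox                          # 1
spec:
  provider:
    type: Infoblox                               # 2
    infoblox:
      credentials:
        name: infoblox-credentials
      gridHost: <infoblox_grid_host>
      wapiPort: 443
      wapiVersion: "2.3.1"
  domains:
  - filterType: Include
    matchType: Exact
    name: test.example.com
  source:                                        # 3
    type: OpenShiftRoute
    openshiftRouteOptions:
      routerName: default                        # 4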
1. Specifies the External DNS name.
2. Defines the provider type.
3. You can define options for the source of DNS records.
4. If the source type is OpenShiftRoute, you can pass the OpenShift Ingress Controller name. External DNS selects the canonical hostname of that router as the target while creating CNAME records.
Create the ExternalDNS resource on Infoblox by running the following command:
$ oc create -f external-dns-sample-infoblox.yaml
From the Infoblox UI, check the DNS records created for console routes:
- Click Data Management → DNS → Zones.
- Select the zone name.
17.8. Configuring the cluster-wide proxy on the External DNS Operator Copy linkLink copied to clipboard!
After configuring the cluster-wide proxy, the Operator Lifecycle Manager (OLM) triggers automatic updates to all of the deployed Operators with the new contents of the HTTP_PROXY, HTTPS_PROXY, and NO_PROXY environment variables.
17.8.1. Trusting the certificate authority of the cluster-wide proxy Copy linkLink copied to clipboard!
You can configure the External DNS Operator to trust the certificate authority of the cluster-wide proxy.
Procedure
Create the config map to contain the CA bundle in the external-dns-operator namespace by running the following command:
$ oc -n external-dns-operator create configmap trusted-ca
To inject the trusted CA bundle into the config map, add the config.openshift.io/inject-trusted-cabundle=true label to the config map by running the following command:
$ oc -n external-dns-operator label cm trusted-ca config.openshift.io/inject-trusted-cabundle=true
Update the subscription of the External DNS Operator by running the following command:
$ oc -n external-dns-operator patch subscription external-dns-operator --type='json' -p='[{"op": "add", "path": "/spec/config", "value":{"env":[{"name":"TRUSTED_CA_CONFIGMAP_NAME","value":"trusted-ca"}]}}]'
Verification
After the deployment of the External DNS Operator is completed, verify that the trusted CA environment variable is added to the external-dns-operator deployment by running the following command:
$ oc -n external-dns-operator exec deploy/external-dns-operator -c external-dns-operator -- printenv TRUSTED_CA_CONFIGMAP_NAME
Example output
trusted-ca
Chapter 18. Network policy Copy linkLink copied to clipboard!
18.1. About network policy Copy linkLink copied to clipboard!
As a developer, you can define network policies that restrict traffic to pods in your cluster.
18.1.1. About network policy Copy linkLink copied to clipboard!
In a cluster using a network plugin that supports Kubernetes network policy, network isolation is controlled entirely by NetworkPolicy objects. In OpenShift Container Platform 4.12, OpenShift SDN supports using network policy in its default network isolation mode.
- A network policy does not apply to the host network namespace. Pods with host networking enabled are unaffected by network policy rules. However, pods connecting to the host-networked pods might be affected by the network policy rules.
- Using the namespaceSelector field without the podSelector field set to {} will not include hostNetwork pods. You must use the podSelector set to {} with the namespaceSelector field in order to target hostNetwork pods when creating network policies.
- Network policies cannot block traffic from localhost or from their resident nodes.
By default, all pods in a project are accessible from other pods and network endpoints. To isolate one or more pods in a project, you can create NetworkPolicy objects in that project to indicate the allowed incoming connections. Project administrators can create and delete NetworkPolicy objects within their own project.
If a pod is matched by selectors in one or more NetworkPolicy objects, then the pod will accept only connections that are allowed by at least one of those NetworkPolicy objects. A pod that is not selected by any NetworkPolicy objects is fully accessible.
A network policy applies to only the Transmission Control Protocol (TCP), User Datagram Protocol (UDP), Internet Control Message Protocol (ICMP), and Stream Control Transmission Protocol (SCTP) protocols. Other protocols are not affected.
The following example NetworkPolicy objects demonstrate supporting different scenarios:
Deny all traffic:
To make a project deny by default, add a NetworkPolicy object that matches all pods but accepts no traffic:
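The original manifest is not reproduced in this extraction. A minimal sketch of such a policy, with an assumed name:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-by-default
spec:
  podSelector: {}
  ingress: []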
Only allow connections from the OpenShift Container Platform Ingress Controller:
To make a project allow only connections from the OpenShift Container Platform Ingress Controller, add the following NetworkPolicy object.
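The original manifest is not reproduced in this extraction. A minimal sketch, assuming the standard policy-group namespace label used by the Ingress Controller:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-openshift-ingress
spec:
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          network.openshift.io/policy-group: ingress
  podSelector: {}
  policyTypes:
  - Ingress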
Only accept connections from pods within a project:
Important: To allow ingress connections from hostNetwork pods in the same namespace, you need to apply the allow-from-hostnetwork policy together with the allow-same-namespace policy.
To make pods accept connections from other pods in the same project, but reject all other connections from pods in other projects, add the following NetworkPolicy object:
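The original manifest is not reproduced in this extraction. A minimal sketch of the allow-same-namespace policy:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
spec:
  podSelector: {}
  ingress:
  - from:
    - podSelector: {}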
Only allow HTTP and HTTPS traffic based on pod labels:
To enable only HTTP and HTTPS access to the pods with a specific label (role=frontend in the following example), add a NetworkPolicy object similar to the following:
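The original manifest is not reproduced in this extraction. A minimal sketch of such a policy:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-http-and-https
spec:
  podSelector:
    matchLabels:
      role: frontend
  ingress:
  - ports:
    - protocol: TCP
      port: 80
    - protocol: TCP
      port: 443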
Accept connections by using both namespace and pod selectors:
To match network traffic by combining namespace and pod selectors, you can use a NetworkPolicy object similar to the following:
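The original manifest is not reproduced in this extraction. A minimal sketch, with assumed labels and an assumed namespace label:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-pod-and-namespace-both
spec:
  podSelector:
    matchLabels:
      name: test-pods
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          project: project_name
      podSelector:
        matchLabels:
          name: test-pods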
NetworkPolicy objects are additive, which means you can combine multiple NetworkPolicy objects together to satisfy complex network requirements.
For example, for the NetworkPolicy objects defined in the previous samples, you can define both allow-same-namespace and allow-http-and-https policies within the same project. This allows the pods with the label role=frontend to accept any connection allowed by each policy: connections on any port from pods in the same namespace, and connections on ports 80 and 443 from pods in any namespace.
18.1.1.1. Using the allow-from-router network policy Copy linkLink copied to clipboard!
Use the following NetworkPolicy to allow external traffic regardless of the router configuration:
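The original manifest is not reproduced in this extraction. A minimal sketch, with an assumed policy name; the numbered comment maps to the note that follows:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-router
spec:
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          policy-group.network.openshift.io/ingress: ""   # 1
  podSelector: {}
  policyTypes:
  - Ingress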
1. The policy-group.network.openshift.io/ingress: "" label supports both OpenShift-SDN and OVN-Kubernetes.
18.1.1.2. Using the allow-from-hostnetwork network policy Copy linkLink copied to clipboard!
Add the following allow-from-hostnetwork NetworkPolicy object to direct traffic from the host network pods.
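The original manifest is not reproduced in this extraction. A minimal sketch, assuming the host-network policy-group namespace label:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-hostnetwork
spec:
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          policy-group.network.openshift.io/host-network: ""
  podSelector: {}
  policyTypes:
  - Ingress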
18.1.2. Optimizations for network policy with OpenShift SDN Copy linkLink copied to clipboard!
Use a network policy to isolate pods that are differentiated from one another by labels within a namespace.
It is inefficient to apply NetworkPolicy objects to large numbers of individual pods in a single namespace. Pod labels do not exist at the IP address level, so a network policy generates a separate Open vSwitch (OVS) flow rule for every possible link between every pod selected with a podSelector.
For example, if the spec podSelector and the ingress podSelector within a NetworkPolicy object each match 200 pods, then 40,000 (200*200) OVS flow rules are generated. This might slow down a node.
When designing your network policy, refer to the following guidelines:
- Reduce the number of OVS flow rules by using namespaces to contain groups of pods that need to be isolated. NetworkPolicy objects that select a whole namespace, by using the namespaceSelector or an empty podSelector, generate only a single OVS flow rule that matches the VXLAN virtual network ID (VNID) of the namespace.
- Keep the pods that do not need to be isolated in their original namespace, and move the pods that require isolation into one or more different namespaces.
- Create additional targeted cross-namespace network policies to allow the specific traffic that you do want to allow from the isolated pods.
18.1.3. Optimizations for network policy with OVN-Kubernetes network plugin Copy linkLink copied to clipboard!
When designing your network policy, refer to the following guidelines:
- For network policies with the same spec.podSelector spec, it is more efficient to use one network policy with multiple ingress or egress rules, than multiple network policies with subsets of ingress or egress rules.
- Every ingress or egress rule based on the podSelector or namespaceSelector spec generates the number of OVS flows proportional to number of pods selected by network policy + number of pods selected by ingress or egress rule. Therefore, it is preferable to use the podSelector or namespaceSelector spec that can select as many pods as you need in one rule, instead of creating individual rules for every pod. For example, the sketch that follows first shows a policy that contains two rules, and then a policy that expresses those same two rules as one.
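The original example manifests are not reproduced in this extraction. A minimal sketch under assumed labels (role: db for the selected pods, role: frontend and role: backend for the peers):

# Two separate ingress rules:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: policy-with-two-rules
spec:
  podSelector:
    matchLabels:
      role: db
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: frontend
  - from:
    - podSelector:
        matchLabels:
          role: backend
---
# The same intent expressed as one rule with two peers:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: policy-with-one-rule
spec:
  podSelector:
    matchLabels:
      role: db
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: frontend
    - podSelector:
        matchLabels:
          role: backend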
The same guideline applies to the spec.podSelector spec. If you have the same ingress or egress rules for different network policies, it might be more efficient to create one network policy with a common spec.podSelector spec. For example, the sketch that follows first shows two policies that have different pod selectors but the same rule, and then a single network policy that expresses those same two policies as one.
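The original example manifests are not reproduced in this extraction. A minimal sketch under assumed labels (role: db and role: client for the selected pods, role: frontend for the peer):

# Two policies with the same ingress rule but different pod selectors:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: policy-db
spec:
  podSelector:
    matchLabels:
      role: db
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: frontend
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: policy-client
spec:
  podSelector:
    matchLabels:
      role: client
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: frontend
---
# One policy with a common podSelector that matches both sets of pods:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: policy-combined
spec:
  podSelector:
    matchExpressions:
    - key: role
      operator: In
      values:
      - db
      - client
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: frontend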
You can apply this optimization when only multiple selectors are expressed as one. In cases where selectors are based on different labels, it may not be possible to apply this optimization. In those cases, consider applying some new labels for network policy optimization specifically.
18.1.3.1. NetworkPolicy CR and external IPs in OVN-Kubernetes Copy linkLink copied to clipboard!
In OVN-Kubernetes, the NetworkPolicy custom resource (CR) enforces strict isolation rules. If a service is exposed using an external IP, a network policy can block access from other namespaces unless explicitly configured to allow traffic.
To allow access to external IPs across namespaces, create a NetworkPolicy CR that explicitly permits ingress from the required namespaces and ensures traffic is allowed to the designated service ports. Without allowing traffic to the required ports, access might still be restricted.
Example NetworkPolicy
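The original manifest is not reproduced in this extraction. A minimal sketch of a policy that allows ingress from all namespaces to a designated service port; port 80 is an assumption:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: <policy_name>
  namespace: <my_namespace>
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector: {}
    ports:
    - protocol: TCP
      port: 80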
where:
<policy_name>- Specifies your name for the policy.
<my_namespace>- Specifies the name of the namespace where the policy is deployed.
For more details, see "About network policy".
18.1.4. Next steps Copy linkLink copied to clipboard!
18.2. Creating a network policy Copy linkLink copied to clipboard!
As a user with the admin role, you can create a network policy for a namespace.
18.2.1. Example NetworkPolicy object Copy linkLink copied to clipboard!
The following annotates an example NetworkPolicy object:
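The original annotated manifest is not reproduced in this extraction. A minimal sketch, with an assumed name, labels, and port; the numbered comments map to the notes that follow:

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-27107                 # 1
spec:
  podSelector:                      # 2
    matchLabels:
      app: mongodb
  ingress:
  - from:
    - podSelector:                  # 3
        matchLabels:
          app: app
    ports:                          # 4
    - protocol: TCP
      port: 27017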
1. The name of the NetworkPolicy object.
2. A selector that describes the pods to which the policy applies. The policy object can only select pods in the project that defines the NetworkPolicy object.
3. A selector that matches the pods from which the policy object allows ingress traffic. The selector matches pods in the same namespace as the NetworkPolicy.
4. A list of one or more destination ports on which to accept traffic.
18.2.2. Creating a network policy using the CLI Copy linkLink copied to clipboard!
To define granular rules describing ingress or egress network traffic allowed for namespaces in your cluster, you can create a network policy.
If you log in with a user with the cluster-admin role, then you can create a network policy in any namespace in the cluster.
Prerequisites
- Your cluster uses a network plugin that supports NetworkPolicy objects, such as the OpenShift SDN network provider with mode: NetworkPolicy set. This mode is the default for OpenShift SDN.
- You installed the OpenShift CLI (oc).
- You are logged in to the cluster with a user with admin privileges.
- You are working in the namespace that the network policy applies to.
Procedure
Create a policy rule:
Create a <policy_name>.yaml file:
$ touch <policy_name>.yaml
where:
<policy_name>: Specifies the network policy file name.
Define a network policy in the file that you just created, such as in the following examples:
Deny ingress from all pods in all namespaces
This is a fundamental policy, blocking all cross-pod networking other than cross-pod traffic allowed by the configuration of other network policies.
Allow ingress from all pods in the same namespace
Allow ingress traffic to one pod from a particular namespace
This policy allows traffic to pods labelled pod-a from pods running in namespace-y.
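The original example manifests are not reproduced in this extraction. The first two examples follow the same shape as the deny-all and allow-same-namespace policies shown in "About network policy". A minimal sketch of the third example, assuming the kubernetes.io/metadata.name namespace label and a pod: pod-a pod label:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-traffic-pod
spec:
  podSelector:
    matchLabels:
      pod: pod-a
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: namespace-y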
To create the network policy object, enter the following command:
$ oc apply -f <policy_name>.yaml -n <namespace>
where:
<policy_name>- Specifies the network policy file name.
<namespace>- Optional: Specifies the namespace if the object is defined in a different namespace than the current namespace.
Example output
networkpolicy.networking.k8s.io/deny-by-default created
If you log in to the web console with cluster-admin privileges, you have a choice of creating a network policy in any namespace in the cluster directly in YAML or from a form in the web console.
18.2.3. Creating a default deny all network policy Copy linkLink copied to clipboard!
This is a fundamental policy, blocking all cross-pod networking other than network traffic allowed by the configuration of other deployed network policies. This procedure enforces a default deny-by-default policy.
If you log in with a user with the cluster-admin role, then you can create a network policy in any namespace in the cluster.
Prerequisites
- Your cluster uses a network plugin that supports NetworkPolicy objects, such as the OpenShift SDN network provider with mode: NetworkPolicy set. This mode is the default for OpenShift SDN.
- You installed the OpenShift CLI (oc).
- You are logged in to the cluster with a user with admin privileges.
- You are working in the namespace that the network policy applies to.
Procedure
Create the following YAML that defines a deny-by-default policy to deny ingress from all pods in all namespaces. Save the YAML in the deny-by-default.yaml file:
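The original manifest is not reproduced in this extraction. A minimal sketch, assuming the policy is created in the default namespace:

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: deny-by-default
  namespace: default
spec:
  podSelector: {}
  ingress: []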
Apply the policy by entering the following command:
$ oc apply -f deny-by-default.yaml
Example output
networkpolicy.networking.k8s.io/deny-by-default created
18.2.4. Creating a network policy to allow traffic from external clients Copy linkLink copied to clipboard!
With the deny-by-default policy in place you can proceed to configure a policy that allows traffic from external clients to a pod with the label app=web.
If you log in with a user with the cluster-admin role, then you can create a network policy in any namespace in the cluster.
Follow this procedure to configure a policy that allows external service from the public Internet directly or by using a Load Balancer to access the pod. Traffic is only allowed to a pod with the label app=web.
Prerequisites
- Your cluster uses a network plugin that supports NetworkPolicy objects, such as the OpenShift SDN network provider with mode: NetworkPolicy set. This mode is the default for OpenShift SDN.
- You installed the OpenShift CLI (oc).
- You are logged in to the cluster with a user with admin privileges.
- You are working in the namespace that the network policy applies to.
Procedure
Create a policy that allows traffic from the public Internet directly or by using a load balancer to access the pod. Save the YAML in the web-allow-external.yaml file.
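The manifest is not shown above. A sketch of such a policy, assuming the target pods carry the label app=web in the default namespace, looks like the following:

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: web-allow-external
  namespace: default
spec:
  policyTypes:
  - Ingress
  podSelector:
    matchLabels:
      app: web
  ingress:
  - {}               # an empty rule allows ingress from any source, including external clients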
Apply the policy by entering the following command:
$ oc apply -f web-allow-external.yaml
Example output
networkpolicy.networking.k8s.io/web-allow-external created
This policy allows traffic from all resources, including external traffic.
18.2.5. Creating a network policy allowing traffic to an application from all namespaces
If you log in with a user with the cluster-admin role, then you can create a network policy in any namespace in the cluster.
Follow this procedure to configure a policy that allows traffic from all pods in all namespaces to a particular application.
Prerequisites
- Your cluster uses a network plugin that supports NetworkPolicy objects, such as the OpenShift SDN network provider with mode: NetworkPolicy set. This mode is the default for OpenShift SDN.
- You installed the OpenShift CLI (oc).
- You are logged in to the cluster with a user with admin privileges.
- You are working in the namespace that the network policy applies to.
Procedure
Create a policy that allows traffic from all pods in all namespaces to a particular application. Save the YAML in the web-allow-all-namespaces.yaml file.
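The manifest is not shown above. A sketch, assuming the application pods carry the label app=web in the default namespace, follows:

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: web-allow-all-namespaces
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: web
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector: {}   # an empty namespace selector matches every namespace in the cluster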
Note: By default, if you omit specifying a namespaceSelector, it does not select any namespaces, which means the policy allows traffic only from the namespace the network policy is deployed to.
Apply the policy by entering the following command:
$ oc apply -f web-allow-all-namespaces.yaml
Example output
networkpolicy.networking.k8s.io/web-allow-all-namespaces created
Verification
Start a web service in the default namespace by entering the following command:
$ oc run web --namespace=default --image=nginx --labels="app=web" --expose --port=80
Run the following command to deploy an alpine image in the secondary namespace and to start a shell:
$ oc run test-$RANDOM --namespace=secondary --rm -i -t --image=alpine -- sh
Run the following command in the shell and observe that the request is allowed:
# wget -qO- --timeout=2 http://web.default
The expected output is the default nginx welcome page (for example, HTML that contains <title>Welcome to nginx!</title>), which indicates that the request was allowed.
18.2.6. Creating a network policy allowing traffic to an application from a namespace
If you log in with a user with the cluster-admin role, then you can create a network policy in any namespace in the cluster.
Follow this procedure to configure a policy that allows traffic to a pod with the label app=web from a particular namespace. You might want to do this to:
- Restrict traffic to a production database only to namespaces where production workloads are deployed.
- Enable monitoring tools deployed to a particular namespace to scrape metrics from the current namespace.
Prerequisites
- Your cluster uses a network plugin that supports NetworkPolicy objects, such as the OpenShift SDN network provider with mode: NetworkPolicy set. This mode is the default for OpenShift SDN.
- You installed the OpenShift CLI (oc).
- You are logged in to the cluster with a user with admin privileges.
- You are working in the namespace that the network policy applies to.
Procedure
Create a policy that allows traffic from all pods in a particular namespace with the label purpose=production. Save the YAML in the web-allow-prod.yaml file.
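The manifest is not shown above. A sketch, assuming the application pods carry the label app=web in the default namespace, follows:

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: web-allow-prod
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: web
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          purpose: production   # only namespaces labeled purpose=production can reach the pods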
Apply the policy by entering the following command:
$ oc apply -f web-allow-prod.yaml
Example output
networkpolicy.networking.k8s.io/web-allow-prod created
Verification
Start a web service in the default namespace by entering the following command:
$ oc run web --namespace=default --image=nginx --labels="app=web" --expose --port=80
Run the following command to create the prod namespace:
$ oc create namespace prod
Run the following command to label the prod namespace:
$ oc label namespace/prod purpose=production
Run the following command to create the dev namespace:
$ oc create namespace dev
Run the following command to label the dev namespace:
$ oc label namespace/dev purpose=testing
Run the following command to deploy an alpine image in the dev namespace and to start a shell:
$ oc run test-$RANDOM --namespace=dev --rm -i -t --image=alpine -- sh
Run the following command in the shell and observe that the request is blocked:
# wget -qO- --timeout=2 http://web.default
Expected output
wget: download timed out
Run the following command to deploy an alpine image in the prod namespace and start a shell:
$ oc run test-$RANDOM --namespace=prod --rm -i -t --image=alpine -- sh
Run the following command in the shell and observe that the request is allowed:
# wget -qO- --timeout=2 http://web.default
The expected output is the default nginx welcome page (for example, HTML that contains <title>Welcome to nginx!</title>), which indicates that the request was allowed.
18.3. Viewing a network policy
As a user with the admin role, you can view a network policy for a namespace.
18.3.1. Example NetworkPolicy object
The following annotates an example NetworkPolicy object:
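The annotated object itself is not reproduced above. A representative sketch follows, with the callout numbers from the list below shown as comments; the name, labels, and port are illustrative:

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-27107        # 1
spec:
  podSelector:             # 2
    matchLabels:
      app: mongodb
  ingress:
  - from:
    - podSelector:         # 3
        matchLabels:
          app: app
    ports:                 # 4
    - protocol: TCP
      port: 27017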
1. The name of the NetworkPolicy object.
2. A selector that describes the pods to which the policy applies. The policy object can only select pods in the project that defines the NetworkPolicy object.
3. A selector that matches the pods from which the policy object allows ingress traffic. The selector matches pods in the same namespace as the NetworkPolicy.
4. A list of one or more destination ports on which to accept traffic.
18.3.2. Viewing network policies using the CLI
You can examine the network policies in a namespace.
If you log in with a user with the cluster-admin role, then you can view any network policy in the cluster.
Prerequisites
- You installed the OpenShift CLI (oc).
- You are logged in to the cluster with a user with admin privileges.
- You are working in the namespace where the network policy exists.
Procedure
List network policies in a namespace:
To view network policy objects defined in a namespace, enter the following command:
$ oc get networkpolicy
Optional: To examine a specific network policy, enter the following command:
$ oc describe networkpolicy <policy_name> -n <namespace>
where:
<policy_name>: Specifies the name of the network policy to inspect.
<namespace>: Optional: Specifies the namespace if the object is defined in a different namespace than the current namespace.
For example:
$ oc describe networkpolicy allow-same-namespace
The command prints the details of the policy, such as its pod selector, allowed ingress sources, and policy types, in the standard oc describe format.
If you log in to the web console with cluster-admin privileges, you have a choice of viewing a network policy in any namespace in the cluster directly in YAML or from a form in the web console.
18.4. Editing a network policy
As a user with the admin role, you can edit an existing network policy for a namespace.
18.4.1. Editing a network policy
You can edit a network policy in a namespace.
If you log in with a user with the cluster-admin role, then you can edit a network policy in any namespace in the cluster.
Prerequisites
- Your cluster uses a network plugin that supports NetworkPolicy objects, such as the OpenShift SDN network provider with mode: NetworkPolicy set. This mode is the default for OpenShift SDN.
- You installed the OpenShift CLI (oc).
- You are logged in to the cluster with a user with admin privileges.
- You are working in the namespace where the network policy exists.
Procedure
Optional: To list the network policy objects in a namespace, enter the following command:
$ oc get networkpolicy
where:
<namespace>: Optional: Specifies the namespace if the object is defined in a different namespace than the current namespace.
Edit the network policy object.
If you saved the network policy definition in a file, edit the file and make any necessary changes, and then enter the following command.
$ oc apply -n <namespace> -f <policy_file>.yaml
where:
<namespace>: Optional: Specifies the namespace if the object is defined in a different namespace than the current namespace.
<policy_file>: Specifies the name of the file containing the network policy.
If you need to update the network policy object directly, enter the following command:
$ oc edit networkpolicy <policy_name> -n <namespace>
where:
<policy_name>: Specifies the name of the network policy.
<namespace>: Optional: Specifies the namespace if the object is defined in a different namespace than the current namespace.
Confirm that the network policy object is updated.
$ oc describe networkpolicy <policy_name> -n <namespace>
where:
<policy_name>: Specifies the name of the network policy.
<namespace>: Optional: Specifies the namespace if the object is defined in a different namespace than the current namespace.
If you log in to the web console with cluster-admin privileges, you have a choice of editing a network policy in any namespace in the cluster directly in YAML or from the policy in the web console through the Actions menu.
18.4.2. Example NetworkPolicy object
The following annotates an example NetworkPolicy object:
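As in the earlier example section, the annotated object is not reproduced above; the same representative sketch applies here, with the callout numbers from the list below shown as comments and with an illustrative name, labels, and port:

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-27107        # 1
spec:
  podSelector:             # 2
    matchLabels:
      app: mongodb
  ingress:
  - from:
    - podSelector:         # 3
        matchLabels:
          app: app
    ports:                 # 4
    - protocol: TCP
      port: 27017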
1. The name of the NetworkPolicy object.
2. A selector that describes the pods to which the policy applies. The policy object can only select pods in the project that defines the NetworkPolicy object.
3. A selector that matches the pods from which the policy object allows ingress traffic. The selector matches pods in the same namespace as the NetworkPolicy.
4. A list of one or more destination ports on which to accept traffic.
18.5. Deleting a network policy
As a user with the admin role, you can delete a network policy from a namespace.
18.5.1. Deleting a network policy using the CLI
You can delete a network policy in a namespace.
If you log in with a user with the cluster-admin role, then you can delete any network policy in the cluster.
Prerequisites
- Your cluster uses a network plugin that supports NetworkPolicy objects, such as the OpenShift SDN network provider with mode: NetworkPolicy set. This mode is the default for OpenShift SDN.
- You installed the OpenShift CLI (oc).
- You are logged in to the cluster with a user with admin privileges.
- You are working in the namespace where the network policy exists.
Procedure
To delete a network policy object, enter the following command:
$ oc delete networkpolicy <policy_name> -n <namespace>
where:
<policy_name>: Specifies the name of the network policy.
<namespace>: Optional: Specifies the namespace if the object is defined in a different namespace than the current namespace.
Example output
networkpolicy.networking.k8s.io/default-deny deleted
If you log in to the web console with cluster-admin privileges, you have a choice of deleting a network policy in any namespace in the cluster directly in YAML or from the policy in the web console through the Actions menu.
18.6. Defining a default network policy for projects
As a cluster administrator, you can modify the new project template to automatically include network policies when you create a new project. If you do not yet have a customized template for new projects, you must first create one.
18.6.1. Modifying the template for new projects
As a cluster administrator, you can modify the default project template so that new projects are created using your custom requirements.
To create your own custom project template:
Procedure
- Log in as a user with cluster-admin privileges.
- Generate the default project template:
$ oc adm create-bootstrap-project-template -o yaml > template.yaml
- Use a text editor to modify the generated template.yaml file by adding objects or modifying existing objects.
- The project template must be created in the openshift-config namespace. Load your modified template:
$ oc create -f template.yaml -n openshift-config
- Edit the project configuration resource using the web console or CLI.
Using the web console:
- Navigate to the Administration → Cluster Settings page.
- Click Configuration to view all configuration resources.
- Find the entry for Project and click Edit YAML.
Using the CLI:
Edit the project.config.openshift.io/cluster resource:
$ oc edit project.config.openshift.io/cluster
Update the spec section to include the projectRequestTemplate and name parameters, and set the name of your uploaded project template. The default name is project-request.
Project configuration resource with custom project template
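The example resource is not reproduced above. A sketch of the relevant part of the Project configuration resource follows; project-request is the default template name and should match the name of the template you loaded:

apiVersion: config.openshift.io/v1
kind: Project
metadata:
  name: cluster
spec:
  projectRequestTemplate:
    name: project-request   # name of the uploaded project template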
- After you save your changes, create a new project to verify that your changes were successfully applied.
18.6.2. Adding network policies to the new project template
As a cluster administrator, you can add network policies to the default template for new projects. OpenShift Container Platform will automatically create all the NetworkPolicy objects specified in the template in the project.
Prerequisites
- Your cluster uses a default CNI network provider that supports NetworkPolicy objects, such as the OpenShift SDN network provider with mode: NetworkPolicy set. This mode is the default for OpenShift SDN.
- You installed the OpenShift CLI (oc).
- You must log in to the cluster with a user with cluster-admin privileges.
- You must have created a custom default project template for new projects.
Procedure
Edit the default template for a new project by running the following command:
$ oc edit template <project_template> -n openshift-config
Replace <project_template> with the name of the default template that you configured for your cluster. The default template name is project-request.
In the template, add each NetworkPolicy object as an element to the objects parameter. The objects parameter accepts a collection of one or more objects.
In the following example, the objects parameter collection includes several NetworkPolicy objects.
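The example template fragment is not reproduced above. A sketch of the objects parameter with two NetworkPolicy objects follows; the policy names match the verification output later in this procedure, and the selector label on the ingress namespace is a commonly used value rather than one taken from this document:

objects:
- apiVersion: networking.k8s.io/v1
  kind: NetworkPolicy
  metadata:
    name: allow-from-same-namespace
  spec:
    podSelector: {}
    ingress:
    - from:
      - podSelector: {}
- apiVersion: networking.k8s.io/v1
  kind: NetworkPolicy
  metadata:
    name: allow-from-openshift-ingress
  spec:
    podSelector: {}
    policyTypes:
    - Ingress
    ingress:
    - from:
      - namespaceSelector:
          matchLabels:
            network.openshift.io/policy-group: ingress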
Optional: Create a new project to confirm that your network policy objects are created successfully by running the following commands:
Create a new project:
$ oc new-project <project>
Replace <project> with the name for the project you are creating.
Confirm that the network policy objects in the new project template exist in the new project:
$ oc get networkpolicy
NAME                           POD-SELECTOR   AGE
allow-from-openshift-ingress   <none>         7s
allow-from-same-namespace      <none>         7s
18.7. Configuring multitenant isolation with network policy
As a cluster administrator, you can configure your network policies to provide multitenant network isolation.
If you are using the OpenShift SDN network plugin, configuring network policies as described in this section provides network isolation similar to multitenant mode but with network policy mode set.
18.7.1. Configuring multitenant isolation by using network policy
You can configure your project to isolate it from pods and services in other project namespaces.
Prerequisites
- Your cluster uses a network plugin that supports NetworkPolicy objects, such as the OpenShift SDN network provider with mode: NetworkPolicy set. This mode is the default for OpenShift SDN.
- You installed the OpenShift CLI (oc).
- You are logged in to the cluster with a user with admin privileges.
Procedure
Create the following NetworkPolicy objects:
A policy named allow-from-openshift-ingress.
Note: policy-group.network.openshift.io/ingress: "" is the preferred namespace selector label for OpenShift SDN. You can use the network.openshift.io/policy-group: ingress namespace selector label, but this is a legacy label.
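The policy manifest is not reproduced above; a sketch that uses the preferred namespace selector label mentioned in the note follows:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-openshift-ingress
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          policy-group.network.openshift.io/ingress: ""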
A policy named allow-from-openshift-monitoring:
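The manifest is not reproduced above; a sketch follows, and the monitoring namespace selector label is an assumption based on the label commonly applied to the OpenShift monitoring namespaces:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-openshift-monitoring
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          network.openshift.io/policy-group: monitoring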
A policy named allow-same-namespace:
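The manifest is not reproduced above; a minimal sketch that allows traffic between pods in the same namespace follows:

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-same-namespace
spec:
  podSelector: {}
  ingress:
  - from:
    - podSelector: {}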
A policy named allow-from-kube-apiserver-operator:
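The manifest is not reproduced above; a sketch follows, and the namespace and pod labels used to select the kube-apiserver-operator are assumptions:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-kube-apiserver-operator
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: openshift-kube-apiserver-operator
      podSelector:
        matchLabels:
          app: kube-apiserver-operator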
For more details, see New kube-apiserver-operator webhook controller validating health of webhook.
Optional: To confirm that the network policies exist in your current project, enter the following command:
$ oc describe networkpolicy
The output lists each of the policies that you created, for example allow-from-openshift-ingress, allow-from-openshift-monitoring, allow-same-namespace, and allow-from-kube-apiserver-operator, together with their pod selectors and allowed ingress sources.
18.7.2. Next steps
Chapter 19. CIDR range definitions
You must specify non-overlapping ranges for the following CIDR ranges.
Machine CIDR ranges cannot be changed after creating your cluster.
OVN-Kubernetes, the default network provider in OpenShift Container Platform 4.11 to 4.13, uses the following IP address ranges internally: 100.64.0.0/16, 169.254.169.0/29, fd98::/64, and fd69::/125. If your cluster uses OVN-Kubernetes, do not include any of these IP address ranges in any other CIDR definitions in your cluster.
19.1. Machine CIDR
In the Machine classless inter-domain routing (CIDR) field, you must specify the IP address range for machines or cluster nodes.
The default is 10.0.0.0/16. This range must not conflict with any connected networks.
19.2. Service CIDR
In the Service CIDR field, you must specify the IP address range for services. The range must be large enough to accommodate your workload. The address block must not overlap with any external service accessed from within the cluster. The default is 172.30.0.0/16.
19.3. Pod CIDR
In the pod CIDR field, you must specify the IP address range for pods.
The pod CIDR is the same as the clusterNetwork CIDR and the cluster CIDR. The range must be large enough to accommodate your workload. The address block must not overlap with any external service accessed from within the cluster. The default is 10.128.0.0/14. You can expand the range after cluster installation.
19.4. Host Prefix
In the Host Prefix field, you must specify the subnet prefix length assigned to pods scheduled to individual machines. The host prefix determines the pod IP address pool for each machine.
For example, if the host prefix is set to /23, each machine is assigned a /23 subnet from the pod CIDR address range. The default is /23, allowing 510 cluster nodes, and 510 pod IP addresses per node.
Chapter 20. AWS Load Balancer Operator
20.1. AWS Load Balancer Operator release notes
The AWS Load Balancer (ALB) Operator deploys and manages an instance of the AWSLoadBalancerController resource.
The AWS Load Balancer (ALB) Operator is only supported on the x86_64 architecture.
These release notes track the development of the AWS Load Balancer Operator in OpenShift Container Platform.
For an overview of the AWS Load Balancer Operator, see AWS Load Balancer Operator in OpenShift Container Platform.
AWS Load Balancer Operator currently does not support AWS GovCloud.
20.1.1. AWS Load Balancer Operator 1.0.0
The AWS Load Balancer Operator is now generally available with this release. The AWS Load Balancer Operator version 1.0.0 supports the AWS Load Balancer Controller version 2.4.4.
The following advisory is available for the AWS Load Balancer Operator version 1.0.0:
20.1.1.1. Notable changes
- This release uses the new v1 API version.
20.1.1.2. Bug fixes
- Previously, the controller provisioned by the AWS Load Balancer Operator did not properly use the configuration for the cluster-wide proxy. These settings are now applied appropriately to the controller. (OCPBUGS-4052, OCPBUGS-5295)
20.1.2. Earlier versions
The two earliest versions of the AWS Load Balancer Operator are available as a Technology Preview. These versions should not be used in a production cluster. For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
The following advisory is available for the AWS Load Balancer Operator version 0.2.0:
The following advisory is available for the AWS Load Balancer Operator version 0.0.1:
20.2. AWS Load Balancer Operator in OpenShift Container Platform
The AWS Load Balancer Operator deploys and manages the AWS Load Balancer Controller. You can install the AWS Load Balancer Operator from OperatorHub by using the OpenShift Container Platform web console or the CLI.
20.2.1. AWS Load Balancer Operator considerations
Review the following limitations before installing and using the AWS Load Balancer Operator:
- The IP traffic mode only works on AWS Elastic Kubernetes Service (EKS). The AWS Load Balancer Operator disables the IP traffic mode for the AWS Load Balancer Controller. As a result of disabling the IP traffic mode, the AWS Load Balancer Controller cannot use the pod readiness gate.
- The AWS Load Balancer Operator adds command-line flags such as --disable-ingress-class-annotation and --disable-ingress-group-name-annotation to the AWS Load Balancer Controller. Therefore, the AWS Load Balancer Operator does not allow using the kubernetes.io/ingress.class and alb.ingress.kubernetes.io/group.name annotations in the Ingress resource.
- You have configured the AWS Load Balancer Operator so that the SVC type is NodePort (not LoadBalancer or ClusterIP).
20.2.2. AWS Load Balancer Operator
The AWS Load Balancer Operator can tag the public subnets if the kubernetes.io/role/elb tag is missing. Also, the AWS Load Balancer Operator detects the following information from the underlying AWS cloud:
- The ID of the virtual private cloud (VPC) in which the cluster hosting the Operator is deployed.
- Public and private subnets of the discovered VPC.
The AWS Load Balancer Operator supports the Kubernetes service resource of type LoadBalancer by using Network Load Balancer (NLB) with the instance target type only.
Prerequisites
- You must have the AWS credentials secret. The credentials are used to provide subnet tagging and VPC discovery.
Procedure
You can deploy the AWS Load Balancer Operator on demand from OperatorHub by creating a Subscription object. To verify the deployment, get the name of the install plan for the subscription by running the following command:
$ oc -n aws-load-balancer-operator get sub aws-load-balancer-operator --template='{{.status.installplan.name}}{{"\n"}}'
Example output
install-zlfbt
Check if the status of an install plan is Complete by running the following command:
$ oc -n aws-load-balancer-operator get ip <install_plan_name> --template='{{.status.phase}}{{"\n"}}'
Example output
Complete
View the status of the aws-load-balancer-operator-controller-manager deployment by running the following command:
$ oc get -n aws-load-balancer-operator deployment/aws-load-balancer-operator-controller-manager
Example output
NAME                                            READY   UP-TO-DATE   AVAILABLE   AGE
aws-load-balancer-operator-controller-manager   1/1     1            1           23h
20.2.3. AWS Load Balancer Operator logs
You can view the AWS Load Balancer Operator logs by using the oc logs command.
Procedure
View the logs of the AWS Load Balancer Operator by running the following command:
$ oc logs -n aws-load-balancer-operator deployment/aws-load-balancer-operator-controller-manager -c manager
20.3. Installing the AWS Load Balancer Operator
The AWS Load Balancer Operator deploys and manages the AWS Load Balancer Controller. You can install the AWS Load Balancer Operator from OperatorHub by using the OpenShift Container Platform web console or the CLI.
20.3.1. Installing the AWS Load Balancer Operator by using the web console
You can install the AWS Load Balancer Operator by using the web console.
Prerequisites
- You have logged in to the OpenShift Container Platform web console as a user with cluster-admin permissions.
- Your cluster is configured with AWS as the platform type and cloud provider.
- If you are using a security token service (STS) or user-provisioned infrastructure, follow the related preparation steps. For example, if you are using AWS Security Token Service, see "Preparing for the AWS Load Balancer Operator on a cluster using the AWS Security Token Service (STS)".
Procedure
- Navigate to Operators → OperatorHub in the OpenShift Container Platform web console.
- Select the AWS Load Balancer Operator. You can use the Filter by keyword text box or use the filter list to search for the AWS Load Balancer Operator from the list of Operators.
- Select the aws-load-balancer-operator namespace.
On the Install Operator page, select the following options:
- Update the channel as stable-v1.
- Installation mode as All namespaces on the cluster (default).
- Installed Namespace as aws-load-balancer-operator. If the aws-load-balancer-operator namespace does not exist, it gets created during the Operator installation.
- Select Update approval as Automatic or Manual. By default, the Update approval is set to Automatic. If you select automatic updates, the Operator Lifecycle Manager (OLM) automatically upgrades the running instance of your Operator without any intervention. If you select manual updates, the OLM creates an update request. As a cluster administrator, you must then manually approve that update request to update the Operator to the new version.
- Click Install.
Verification
- Verify that the AWS Load Balancer Operator shows the Status as Succeeded on the Installed Operators dashboard.
20.3.2. Installing the AWS Load Balancer Operator by using the CLI
You can install the AWS Load Balancer Operator by using the CLI.
Prerequisites
- You are logged in to the OpenShift Container Platform web console as a user with cluster-admin permissions.
- Your cluster is configured with AWS as the platform type and cloud provider.
- You are logged into the OpenShift CLI (oc).
Procedure
Create a Namespace object:
Create a YAML file that defines the Namespace object:
Example namespace.yaml file
apiVersion: v1
kind: Namespace
metadata:
  name: aws-load-balancer-operator
Create the Namespace object by running the following command:
$ oc apply -f namespace.yaml
Create a CredentialsRequest object:
Create a YAML file that defines the CredentialsRequest object:
Example credentialsrequest.yaml file
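The example file is not reproduced above. The following is a sketch of the overall CredentialsRequest structure only; the complete set of IAM actions that the Operator needs is defined in the CredentialsRequest file shipped with the aws-load-balancer-operator project, and the actions shown here are illustrative:

apiVersion: cloudcredential.openshift.io/v1
kind: CredentialsRequest
metadata:
  name: aws-load-balancer-operator
  namespace: openshift-cloud-credential-operator
spec:
  providerSpec:
    apiVersion: cloudcredential.openshift.io/v1
    kind: AWSProviderSpec
    statementEntries:
    - action:              # illustrative subset of permissions
      - ec2:DescribeSubnets
      - ec2:CreateTags
      - ec2:DeleteTags
      effect: Allow
      resource: "*"
  secretRef:
    name: aws-load-balancer-operator
    namespace: aws-load-balancer-operator
  serviceAccountNames:
  - aws-load-balancer-operator-controller-manager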
Create the CredentialsRequest object by running the following command:
$ oc apply -f credentialsrequest.yaml
Create an OperatorGroup object:
Create a YAML file that defines the OperatorGroup object:
Example operatorgroup.yaml file
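The example file is not reproduced above. A minimal sketch follows; leaving the spec empty selects the all-namespaces install mode, which matches the default installation mode described in the web console procedure:

apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: aws-load-balancer-operator
  namespace: aws-load-balancer-operator
spec: {}   # no targetNamespaces: the Operator watches all namespaces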
Create the OperatorGroup object by running the following command:
$ oc apply -f operatorgroup.yaml
Create a Subscription object:
Create a YAML file that defines the Subscription object:
Example subscription.yaml file
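The example file is not reproduced above. A sketch follows; the stable-v1 channel matches the channel named in the web console procedure, and the redhat-operators catalog source is an assumption:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: aws-load-balancer-operator
  namespace: aws-load-balancer-operator
spec:
  channel: stable-v1
  name: aws-load-balancer-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace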
Create the Subscription object by running the following command:
$ oc apply -f subscription.yaml
Verification
Get the name of the install plan from the subscription:
$ oc -n aws-load-balancer-operator \
    get subscription aws-load-balancer-operator \
    --template='{{.status.installplan.name}}{{"\n"}}'
Check the status of the install plan:
$ oc -n aws-load-balancer-operator \
    get ip <install_plan_name> \
    --template='{{.status.phase}}{{"\n"}}'
The output must be Complete.
20.4. Preparing for the AWS Load Balancer Operator on a cluster using the AWS Security Token Service
You can install the AWS Load Balancer Operator on a cluster that uses STS. Follow these steps to prepare your cluster before installing the Operator.
The AWS Load Balancer Operator relies on the CredentialsRequest object to bootstrap the Operator and the AWS Load Balancer Controller. The AWS Load Balancer Operator waits until the required secrets are created and available. The Cloud Credential Operator does not provision the secrets automatically in the STS cluster. You must set the credentials secrets manually by using the ccoctl binary.
If you do not want to provision the credentials secret by using the Cloud Credential Operator, you can configure the AWSLoadBalancerController instance on the STS cluster by specifying the credentials secret in the AWS Load Balancer Controller custom resource (CR).
20.4.1. Bootstrapping AWS Load Balancer Operator on Security Token Service cluster
Prerequisites
- You must extract and prepare the ccoctl binary.
Procedure
Create the aws-load-balancer-operator namespace by running the following command:
$ oc create namespace aws-load-balancer-operator
Download the CredentialsRequest custom resource (CR) of the AWS Load Balancer Operator, and create a directory to store it by running the following command:
$ curl --create-dirs -o <path-to-credrequests-dir>/cr.yaml https://raw.githubusercontent.com/openshift/aws-load-balancer-operator/main/hack/operator-credentials-request.yaml
Use the ccoctl tool to process CredentialsRequest objects of the AWS Load Balancer Operator by running the following command:
$ ccoctl aws create-iam-roles \
    --name <name> --region=<aws_region> \
    --credentials-requests-dir=<path-to-credrequests-dir> \
    --identity-provider-arn <oidc-arn>
Apply the secrets generated in the manifests directory of your cluster by running the following command:
$ ls manifests/*-credentials.yaml | xargs -I{} oc apply -f {}
Verify that the credentials secret of the AWS Load Balancer Operator is created by running the following command:
$ oc -n aws-load-balancer-operator get secret aws-load-balancer-operator --template='{{index .data "credentials"}}' | base64 -d
Example output
[default]
sts_regional_endpoints = regional
role_arn = arn:aws:iam::999999999999:role/aws-load-balancer-operator-aws-load-balancer-operator
web_identity_token_file = /var/run/secrets/openshift/serviceaccount/token
20.4.2. Configuring AWS Load Balancer Operator on Security Token Service cluster by using managed CredentialsRequest objects
Prerequisites
- You must extract and prepare the ccoctl binary.
Procedure
The AWS Load Balancer Operator creates the CredentialsRequest object in the openshift-cloud-credential-operator namespace for each AWSLoadBalancerController custom resource (CR). You can extract and save the created CredentialsRequest object in a directory by running the following command:
$ oc get credentialsrequest -n openshift-cloud-credential-operator \
    aws-load-balancer-controller-<cr-name> -o yaml > <path-to-credrequests-dir>/cr.yaml
The aws-load-balancer-controller-<cr-name> parameter specifies the credential request name created by the AWS Load Balancer Operator. The cr-name specifies the name of the AWS Load Balancer Controller instance.
Use the ccoctl tool to process all CredentialsRequest objects in the credrequests directory by running the following command:
$ ccoctl aws create-iam-roles \
    --name <name> --region=<aws_region> \
    --credentials-requests-dir=<path-to-credrequests-dir> \
    --identity-provider-arn <oidc-arn>
Apply the secrets generated in the manifests directory to your cluster by running the following command:
$ ls manifests/*-credentials.yaml | xargs -I{} oc apply -f {}
Verify that the aws-load-balancer-controller pod is created:
$ oc -n aws-load-balancer-operator get pods
NAME                                                            READY   STATUS    RESTARTS   AGE
aws-load-balancer-controller-cluster-9b766d6-gg82c              1/1     Running   0          137m
aws-load-balancer-operator-controller-manager-b55ff68cc-85jzg   2/2     Running   0          3h26m
20.4.3. Configuring the AWS Load Balancer Operator on Security Token Service cluster by using specific credentials
You can specify the credential secret by using the spec.credentials field in the AWS Load Balancer Controller custom resource (CR). You can use the predefined CredentialsRequest object of the controller to know which roles are required.
Prerequisites
- You must extract and prepare the ccoctl binary.
Procedure
Download the CredentialsRequest custom resource (CR) of the AWS Load Balancer Controller, and create a directory to store it by running the following command:
$ curl --create-dirs -o <path-to-credrequests-dir>/cr.yaml https://raw.githubusercontent.com/openshift/aws-load-balancer-operator/main/hack/controller/controller-credentials-request.yaml
Use the ccoctl tool to process the CredentialsRequest object of the controller:
$ ccoctl aws create-iam-roles \
    --name <name> --region=<aws_region> \
    --credentials-requests-dir=<path-to-credrequests-dir> \
    --identity-provider-arn <oidc-arn>
Apply the secrets to your cluster:
$ ls manifests/*-credentials.yaml | xargs -I{} oc apply -f {}
Verify the credentials secret has been created for use by the controller:
$ oc -n aws-load-balancer-operator get secret aws-load-balancer-controller-manual-cluster --template='{{index .data "credentials"}}' | base64 -d
Example output
[default]
sts_regional_endpoints = regional
role_arn = arn:aws:iam::999999999999:role/aws-load-balancer-operator-aws-load-balancer-controller
web_identity_token_file = /var/run/secrets/openshift/serviceaccount/token
Create the AWSLoadBalancerController resource YAML file, for example, sample-aws-lb-manual-creds.yaml, as follows:
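The example file is not reproduced above. A sketch follows; the API version is an assumption, and the credentials name references the secret that you verified in the previous step:

apiVersion: networking.olm.openshift.io/v1
kind: AWSLoadBalancerController
metadata:
  name: cluster
spec:
  credentials:
    name: aws-load-balancer-controller-manual-cluster   # secret containing the STS credentials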
20.5. Creating an instance of the AWS Load Balancer Controller
After installing the AWS Load Balancer Operator, you can create the AWS Load Balancer Controller.
20.5.1. Creating the AWS Load Balancer Controller
You can install only a single instance of the AWSLoadBalancerController object in a cluster. You can create the AWS Load Balancer Controller by using the CLI. The AWS Load Balancer Operator reconciles only the resource named cluster.
Prerequisites
- You have created the echoserver namespace.
- You have access to the OpenShift CLI (oc).
Procedure
Create a YAML file that defines the AWSLoadBalancerController object:
Example sample-aws-lb.yaml file
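The example file is not reproduced above. A sketch follows, with the callout numbers from the list below shown as comments; the API version, tag key, and tag value are assumptions:

apiVersion: networking.olm.openshift.io/v1
kind: AWSLoadBalancerController   # 1
metadata:
  name: cluster                   # 2
spec:
  subnetTagging: Auto             # 3
  additionalResourceTags:         # 4
  - key: example.org/cost-center
    value: "5113"
  ingressClass: alb               # 5
  config:
    replicas: 2                   # 6
  enabledAddons:                  # 7
  - AWSWAFv2                      # 8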
1. Defines the AWSLoadBalancerController object.
2. Defines the AWS Load Balancer Controller name. This instance name gets added as a suffix to all related resources.
3. Configures the subnet tagging method for the AWS Load Balancer Controller. The following values are valid:
   - Auto: The AWS Load Balancer Operator determines the subnets that belong to the cluster and tags them appropriately. The Operator cannot determine the role correctly if the internal subnet tags are not present on the internal subnet.
   - Manual: You manually tag the subnets that belong to the cluster with the appropriate role tags. Use this option if you installed your cluster on user-provided infrastructure.
4. Defines the tags used by the AWS Load Balancer Controller when it provisions AWS resources.
5. Defines the ingress class name. The default value is alb.
6. Specifies the number of replicas of the AWS Load Balancer Controller.
7. Specifies annotations as an add-on for the AWS Load Balancer Controller.
8. Enables the alb.ingress.kubernetes.io/wafv2-acl-arn annotation.
Create the AWSLoadBalancerController object by running the following command:
$ oc create -f sample-aws-lb.yaml
Create a YAML file that defines the Deployment resource:
Example sample-aws-lb.yaml file
Create a YAML file that defines the Service resource:
Example service-albo.yaml file
Create a YAML file that defines the Ingress resource:
Example ingress-albo.yaml file
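The Deployment, Service, and Ingress manifests are not reproduced above; the Deployment and Service are a standard echoserver workload exposed on port 80 in the echoserver namespace. A sketch of the Ingress resource follows, with illustrative names and annotations:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: echoserver
  namespace: echoserver
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: instance
spec:
  ingressClassName: alb
  rules:
  - http:
      paths:
      - path: /
        pathType: Exact
        backend:
          service:
            name: echoserver
            port:
              number: 80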
Verification
Save the status of the Ingress resource in the HOST variable by running the following command:
$ HOST=$(oc get ingress -n echoserver echoserver --template='{{(index .status.loadBalancer.ingress 0).hostname}}')
Verify the status of the Ingress resource by running the following command:
$ curl $HOST
20.6. Serving multiple ingress resources through a single AWS Load Balancer
You can route the traffic to different services that are part of a single domain through a single AWS Load Balancer. Each Ingress resource provides different endpoints of the domain.
20.6.1. Creating multiple ingress resources through a single AWS Load Balancer
You can route the traffic to multiple ingress resources through a single AWS Load Balancer by using the CLI.
Prerequisites
- You have access to the OpenShift CLI (oc).
Procedure
Create an IngressClassParams resource YAML file, for example, sample-single-lb-params.yaml, as follows:
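The example file is not reproduced above. A sketch follows; the elbv2.k8s.aws/v1beta1 API version and the group name are assumptions, and the group is what lets several Ingress resources share one load balancer:

apiVersion: elbv2.k8s.aws/v1beta1
kind: IngressClassParams
metadata:
  name: single-lb-params
spec:
  group:
    name: single-lb   # all ingresses that reference these parameters join this group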
Create the IngressClassParams resource by running the following command:
$ oc create -f sample-single-lb-params.yaml
Create the IngressClass resource YAML file, for example, sample-single-lb-class.yaml, as follows:
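The example file is not reproduced above. A sketch follows, with the callout numbers from the list below shown as comments:

apiVersion: networking.k8s.io/v1      # 1
kind: IngressClass
metadata:
  name: single-lb                     # 2
spec:
  controller: ingress.k8s.aws/alb     # 3
  parameters:
    apiGroup: elbv2.k8s.aws           # 4
    kind: IngressClassParams          # 5
    name: single-lb-params            # 6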
1. Defines the API group and version of the IngressClass resource.
2. Specifies the ingress class name.
3. Defines the controller name. The ingress.k8s.aws/alb value denotes that all ingress resources of this class should be managed by the AWS Load Balancer Controller.
4. Defines the API group of the IngressClassParams resource.
5. Defines the resource type of the IngressClassParams resource.
6. Defines the IngressClassParams resource name.
Create the IngressClass resource by running the following command:
$ oc create -f sample-single-lb-class.yaml
Create the AWSLoadBalancerController resource YAML file, for example, sample-single-lb.yaml, as follows:
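The example file is not reproduced above. A sketch follows, with the callout number from the list below shown as a comment; the API version is an assumption:

apiVersion: networking.olm.openshift.io/v1
kind: AWSLoadBalancerController
metadata:
  name: cluster
spec:
  subnetTagging: Auto
  ingressClass: single-lb   # 1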
1. Defines the name of the IngressClass resource.
Create the AWSLoadBalancerController resource by running the following command:
$ oc create -f sample-single-lb.yaml
Create the Ingress resource YAML file, for example, sample-multiple-ingress.yaml, as follows:
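The example file is not reproduced above. A sketch of one of the Ingress resources follows, with the callout numbers from the list below shown as comments; additional Ingress objects in the same group differ only in name, group order, path, and backend. The host, path, and service names are illustrative:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-1                                       # 1
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing   # 2
    alb.ingress.kubernetes.io/group.order: "1"          # 3
    alb.ingress.kubernetes.io/target-type: instance     # 4
spec:
  ingressClassName: single-lb                           # 5
  rules:
  - host: example.com                                   # 6
    http:
      paths:
      - path: /blog                                     # 7
        pathType: Prefix
        backend:
          service:
            name: example-1                             # 8
            port:
              number: 80                                # 9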
1. Specifies the ingress name.
2. Indicates the load balancer to provision in the public subnet to access the internet.
3. Specifies the order in which the rules from the multiple ingress resources are matched when the request is received at the load balancer.
4. Indicates that the load balancer will target OpenShift Container Platform nodes to reach the service.
5. Specifies the ingress class that belongs to this ingress.
6. Defines a domain name used for request routing.
7. Defines the path that must route to the service.
8. Defines the service name that serves the endpoint configured in the Ingress resource.
9. Defines the port on the service that serves the endpoint.
Create the Ingress resource by running the following command:
$ oc create -f sample-multiple-ingress.yaml
20.7. Adding TLS termination
You can add TLS termination on the AWS Load Balancer.
20.7.1. Adding TLS termination on the AWS Load Balancer
You can route the traffic for the domain to pods of a service and add TLS termination on the AWS Load Balancer.
Prerequisites
- You have access to the OpenShift CLI (oc).
Procedure
Create a YAML file that defines the AWSLoadBalancerController resource:
Example add-tls-termination-albc.yaml file
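The example file is not reproduced above. A sketch follows, with the callout number from the list below shown as a comment; the API version and ingress class name are assumptions:

apiVersion: networking.olm.openshift.io/v1
kind: AWSLoadBalancerController
metadata:
  name: cluster
spec:
  subnetTagging: Auto
  ingressClass: tls-termination   # 1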
1. Defines the ingress class name. If the ingress class is not present in your cluster, the AWS Load Balancer Controller creates one. The AWS Load Balancer Controller reconciles the additional ingress class values if spec.controller is set to ingress.k8s.aws/alb.
Create a YAML file that defines the Ingress resource:
Example add-tls-termination-ingress.yaml file
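The example file is not reproduced above. A sketch follows, with the callout numbers from the list below shown as comments; the name, certificate ARN, domain, and service name are illustrative:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example                                           # 1
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing     # 2
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-west-2:123456789012:certificate/example   # 3
spec:
  ingressClassName: tls-termination                       # 4
  rules:
  - host: example.com                                     # 5
    http:
      paths:
      - path: /
        pathType: Exact
        backend:
          service:
            name: example-service                         # 6
            port:
              number: 80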
1. Specifies the ingress name.
2. The controller provisions the load balancer for ingress in a public subnet to access the load balancer over the internet.
3. The Amazon Resource Name (ARN) of the certificate that you attach to the load balancer.
4. Defines the ingress class name.
5. Defines the domain for traffic routing.
6. Defines the service for traffic routing.
20.8. Configuring cluster-wide proxy
You can configure the cluster-wide proxy in the AWS Load Balancer Operator. After configuring the cluster-wide proxy, Operator Lifecycle Manager (OLM) automatically updates all the deployments of the Operators with the environment variables such as HTTP_PROXY, HTTPS_PROXY, and NO_PROXY. These variables are populated to the managed controller by the AWS Load Balancer Operator.
20.8.1. Trusting the certificate authority of the cluster-wide proxy
Create the config map to contain the certificate authority (CA) bundle in the aws-load-balancer-operator namespace by running the following command:
$ oc -n aws-load-balancer-operator create configmap trusted-ca
To inject the trusted CA bundle into the config map, add the config.openshift.io/inject-trusted-cabundle=true label to the config map by running the following command:
$ oc -n aws-load-balancer-operator label cm trusted-ca config.openshift.io/inject-trusted-cabundle=true
Update the AWS Load Balancer Operator subscription to access the config map in the AWS Load Balancer Operator deployment by running the following command:
$ oc -n aws-load-balancer-operator patch subscription aws-load-balancer-operator --type='merge' -p '{"spec":{"config":{"env":[{"name":"TRUSTED_CA_CONFIGMAP_NAME","value":"trusted-ca"}],"volumes":[{"name":"trusted-ca","configMap":{"name":"trusted-ca"}}],"volumeMounts":[{"name":"trusted-ca","mountPath":"/etc/pki/tls/certs/albo-tls-ca-bundle.crt","subPath":"ca-bundle.crt"}]}}}'
After the AWS Load Balancer Operator is deployed, verify that the CA bundle is added to the aws-load-balancer-operator-controller-manager deployment by running the following command:
$ oc -n aws-load-balancer-operator exec deploy/aws-load-balancer-operator-controller-manager -c manager -- bash -c "ls -l /etc/pki/tls/certs/albo-tls-ca-bundle.crt; printenv TRUSTED_CA_CONFIGMAP_NAME"
Example output
-rw-r--r--. 1 root 1000690000 5875 Jan 11 12:25 /etc/pki/tls/certs/albo-tls-ca-bundle.crt
trusted-ca
Optional: Restart the deployment of the AWS Load Balancer Operator every time the config map changes by running the following command:
$ oc -n aws-load-balancer-operator rollout restart deployment/aws-load-balancer-operator-controller-manager
Chapter 21. Multiple networks
21.1. Understanding multiple networks
In Kubernetes, container networking is delegated to networking plugins that implement the Container Network Interface (CNI).
OpenShift Container Platform uses the Multus CNI plugin to allow chaining of CNI plugins. During cluster installation, you configure your default pod network. The default network handles all ordinary network traffic for the cluster. You can define an additional network based on the available CNI plugins and attach one or more of these networks to your pods. You can define more than one additional network for your cluster, depending on your needs. This gives you flexibility when you configure pods that deliver network functionality, such as switching or routing.
21.1.1. Usage scenarios for an additional network
You can use an additional network in situations where network isolation is needed, including data plane and control plane separation. Isolating network traffic is useful for the following performance and security reasons:
- Performance
- You can send traffic on two different planes to manage how much traffic is along each plane.
- Security
- You can send sensitive traffic onto a network plane that is managed specifically for security considerations, and you can separate private data that must not be shared between tenants or customers.
All of the pods in the cluster still use the cluster-wide default network to maintain connectivity across the cluster. Every pod has an eth0 interface that is attached to the cluster-wide pod network. You can view the interfaces for a pod by using the oc exec -it <pod_name> -- ip a command. If you add additional network interfaces that use Multus CNI, they are named net1, net2, …, netN.
To attach additional network interfaces to a pod, you must create configurations that define how the interfaces are attached. You specify each interface by using a NetworkAttachmentDefinition custom resource (CR). A CNI configuration inside each of these CRs defines how that interface is created.
21.1.2. Additional networks in OpenShift Container Platform
OpenShift Container Platform provides the following CNI plugins for creating additional networks in your cluster:
- bridge: Configure a bridge-based additional network to allow pods on the same host to communicate with each other and the host.
- host-device: Configure a host-device additional network to allow pods access to a physical Ethernet network device on the host system.
- ipvlan: Configure an ipvlan-based additional network to allow pods on a host to communicate with other hosts and pods on those hosts, similar to a macvlan-based additional network. Unlike a macvlan-based additional network, each pod shares the same MAC address as the parent physical network interface.
- macvlan: Configure a macvlan-based additional network to allow pods on a host to communicate with other hosts and pods on those hosts by using a physical network interface. Each pod that is attached to a macvlan-based additional network is provided a unique MAC address.
- SR-IOV: Configure an SR-IOV based additional network to allow pods to attach to a virtual function (VF) interface on SR-IOV capable hardware on the host system.
21.2. Configuring an additional network Copy linkLink copied to clipboard!
As a cluster administrator, you can configure an additional network for your cluster. The following network types are supported:
- Bridge
- Host device
- IPVLAN
- MACVLAN
21.2.1. Approaches to managing an additional network Copy linkLink copied to clipboard!
You can manage the lifecycle of an additional network in OpenShift Container Platform by using one of two approaches: modifying the Cluster Network Operator (CNO) configuration or applying a YAML manifest. The approaches are mutually exclusive and you can only use one approach for managing an additional network at a time. For either approach, the additional network is managed by a Container Network Interface (CNI) plugin that you configure. The two approaches are summarized here:
- Modifying the Cluster Network Operator (CNO) configuration: Configuring additional networks through the CNO is only possible for cluster administrators. The CNO automatically creates and manages the NetworkAttachmentDefinition object. By using this approach, you can define NetworkAttachmentDefinition objects at install time through configuration of the install-config.
- Applying a YAML manifest: You can manage the additional network directly by creating a NetworkAttachmentDefinition object. Compared to modifying the CNO configuration, this approach gives you more granular control and flexibility when it comes to configuration.
When deploying OpenShift Container Platform nodes with multiple network interfaces on Red Hat OpenStack Platform (RHOSP) with OVN Kubernetes, DNS configuration of the secondary interface might take precedence over the DNS configuration of the primary interface. In this case, remove the DNS nameservers for the subnet ID that is attached to the secondary interface:
$ openstack subnet set --dns-nameserver 0.0.0.0 <subnet_id>
21.2.2. IP address assignment for additional networks Copy linkLink copied to clipboard!
For additional networks, IP addresses can be assigned using an IP Address Management (IPAM) CNI plugin, which supports various assignment methods, including Dynamic Host Configuration Protocol (DHCP) and static assignment.
The DHCP IPAM CNI plugin responsible for dynamic assignment of IP addresses operates with two distinct components:
- CNI Plugin: Responsible for integrating with the Kubernetes networking stack to request and release IP addresses.
- DHCP IPAM CNI Daemon: A listener for DHCP events that coordinates with existing DHCP servers in the environment to handle IP address assignment requests. This daemon is not a DHCP server itself.
For networks requiring type: dhcp in their IPAM configuration, ensure the following:
- A DHCP server is available and running in the environment. The DHCP server is external to the cluster and is expected to be part of the customer’s existing network infrastructure.
- The DHCP server is appropriately configured to serve IP addresses to the nodes.
In cases where a DHCP server is unavailable in the environment, it is recommended to use the Whereabouts IPAM CNI plugin instead. The Whereabouts CNI provides similar IP address management capabilities without the need for an external DHCP server.
Use the Whereabouts CNI plugin when there is no external DHCP server or where static IP address management is preferred. The Whereabouts plugin includes a reconciler daemon to manage stale IP address allocations.
A DHCP lease must be periodically renewed throughout the container’s lifetime, so a separate daemon, the DHCP IPAM CNI Daemon, is required. To deploy the DHCP IPAM CNI daemon, modify the Cluster Network Operator (CNO) configuration to trigger the deployment of this daemon as part of the additional network setup.
21.2.3. Configuration for an additional network attachment Copy linkLink copied to clipboard!
An additional network is configured by using the NetworkAttachmentDefinition API in the k8s.cni.cncf.io API group.
Do not store any sensitive information or a secret in the NetworkAttachmentDefinition object because this information is accessible by the project administration user.
The configuration for the API is described in the following table:
| Field | Type | Description |
|---|---|---|
| metadata.name | string | The name for the additional network. |
| metadata.namespace | string | The namespace that the object is associated with. |
| spec.config | string | The CNI plugin configuration in JSON format. |
21.2.3.1. Configuration of an additional network through the Cluster Network Operator Copy linkLink copied to clipboard!
The configuration for an additional network attachment is specified as part of the Cluster Network Operator (CNO) configuration.
The following YAML describes the configuration parameters for managing an additional network with the CNO:
Cluster Network Operator configuration
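For example, a Network CR of roughly the following shape; the name, namespace, and raw CNI configuration are placeholders, and the callout numbers correspond to the notes that follow:

apiVersion: operator.openshift.io/v1
kind: Network
metadata:
  name: cluster
spec:
  additionalNetworks: # 1
  - name: <name> # 2
    namespace: <namespace> # 3
    rawCNIConfig: |- # 4
      {
        ...
      }
    type: Raw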
- 1: An array of one or more additional network configurations.
- 2: The name for the additional network attachment that you are creating. The name must be unique within the specified namespace.
- 3: The namespace to create the network attachment in. If you do not specify a value, then the default namespace is used.
- 4: A CNI plugin configuration in JSON format.
21.2.3.2. Configuration of an additional network from a YAML manifest Copy linkLink copied to clipboard!
The configuration for an additional network is specified from a YAML configuration file, such as in the following example:
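For instance, a NetworkAttachmentDefinition manifest along the following lines; the network name, device, and IPAM settings shown here are illustrative placeholders, not values required by your cluster:

apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: next-net
spec:
  config: |-
    {
      "cniVersion": "0.3.1",
      "name": "work-network",
      "type": "host-device",
      "device": "eth1",
      "ipam": {
        "type": "dhcp"
      }
    }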
21.2.4. Configurations for additional network types Copy linkLink copied to clipboard!
The specific configuration fields for additional networks are described in the following sections.
21.2.4.1. Configuration for a bridge additional network Copy linkLink copied to clipboard!
The following object describes the configuration parameters for the bridge CNI plugin:
| Field | Type | Description |
|---|---|---|
| cniVersion | string | The CNI specification version. The 0.3.1 value is required. |
| name | string | The value for the name parameter you provided previously for the CNO configuration. |
| type | string | The name of the CNI plugin to configure: bridge. |
| ipam | object | The configuration object for the IPAM CNI plugin. The plugin manages IP address assignment for the attachment definition. |
| bridge | string | Optional: Specify the name of the virtual bridge to use. If the bridge interface does not exist on the host, it is created. The default value is cni0. |
| ipMasq | boolean | Optional: Set to true to enable IP masquerading for traffic that leaves the virtual network. The source IP address for all traffic is rewritten to the bridge IP address. The default value is false. |
| isGateway | boolean | Optional: Set to true to assign an IP address to the bridge. The default value is false. |
| isDefaultGateway | boolean | Optional: Set to true to configure the bridge as the default gateway for the virtual network. When enabled, isGateway is also enabled. The default value is false. |
| forceAddress | boolean | Optional: Set to true to allow assignment of a previously assigned IP address to the virtual bridge. The default value is false. |
| hairpinMode | boolean | Optional: Set to true to allow the virtual bridge to send an Ethernet frame back through the virtual port it was received on. This mode is also known as reflective relay. The default value is false. |
| promiscMode | boolean | Optional: Set to true to enable promiscuous mode on the bridge. The default value is false. |
| vlan | string | Optional: Specify a virtual LAN (VLAN) tag as an integer value. By default, no VLAN tag is assigned. |
| preserveDefaultVlan | string | Optional: Indicates whether the default vlan must be preserved on the veth end connected to the bridge. Defaults to true. |
| mtu | integer | Optional: Set the maximum transmission unit (MTU) to the specified value. The default value is automatically set by the kernel. |
| enabledad | boolean | Optional: Enables duplicate address detection for the container side veth. The default value is false. |
| macspoofchk | boolean | Optional: Enables mac spoof check, limiting the traffic originating from the container to the mac address of the interface. The default value is false. |
The VLAN parameter configures the VLAN tag on the host end of the veth and also enables the vlan_filtering feature on the bridge interface.
To configure the uplink for an L2 network, you must allow the VLAN on the uplink interface by using the following command:
$ bridge vlan add vid VLAN_ID dev DEV
21.2.4.1.1. bridge configuration example Copy linkLink copied to clipboard!
The following example configures an additional network named bridge-net:
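One way to express this is a CNI configuration of roughly the following shape; the VLAN tag and the DHCP IPAM choice are illustrative, not required:

{
  "cniVersion": "0.3.1",
  "name": "bridge-net",
  "type": "bridge",
  "isGateway": true,
  "vlan": 2,
  "ipam": {
    "type": "dhcp"
  }
}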
21.2.4.2. Configuration for a host device additional network Copy linkLink copied to clipboard!
Specify your network device by setting only one of the following parameters: device, hwaddr, kernelpath, or pciBusID.
The following object describes the configuration parameters for the host-device CNI plugin:
| Field | Type | Description |
|---|---|---|
| cniVersion | string | The CNI specification version. The 0.3.1 value is required. |
| name | string | The value for the name parameter you provided previously for the CNO configuration. |
| type | string | The name of the CNI plugin to configure: host-device. |
| device | string | Optional: The name of the device, such as eth0. |
| hwaddr | string | Optional: The device hardware MAC address. |
| kernelpath | string | Optional: The Linux kernel device path of the network device. |
| pciBusID | string | Optional: The PCI address of the network device. |
21.2.4.2.1. host-device configuration example Copy linkLink copied to clipboard!
The following example configures an additional network named hostdev-net:
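For example, a configuration along these lines, assuming the host device eth1 and DHCP-based address assignment:

{
  "cniVersion": "0.3.1",
  "name": "hostdev-net",
  "type": "host-device",
  "device": "eth1",
  "ipam": {
    "type": "dhcp"
  }
}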
21.2.4.3. Configuration for an IPVLAN additional network Copy linkLink copied to clipboard!
The following object describes the configuration parameters for the IPVLAN CNI plugin:
| Field | Type | Description |
|---|---|---|
| cniVersion | string | The CNI specification version. The 0.3.1 value is required. |
| name | string | The value for the name parameter you provided previously for the CNO configuration. |
| type | string | The name of the CNI plugin to configure: ipvlan. |
| ipam | object | The configuration object for the IPAM CNI plugin. The plugin manages IP address assignment for the attachment definition. This is required unless the plugin is chained. |
| mode | string | Optional: The operating mode for the virtual network. The value must be l2, l3, or l3s. The default value is l2. |
| master | string | Optional: The Ethernet interface to associate with the network attachment. If a master is not specified, the interface for the default network route is used. |
| mtu | integer | Optional: Set the maximum transmission unit (MTU) to the specified value. The default value is automatically set by the kernel. |
- The ipvlan object does not allow virtual interfaces to communicate with the master interface. Therefore, the container is not able to reach the host by using the ipvlan interface. Be sure that the container joins a network that provides connectivity to the host, such as a network supporting the Precision Time Protocol (PTP).
- A single master interface cannot simultaneously be configured to use both macvlan and ipvlan.
- For IP allocation schemes that cannot be interface agnostic, the ipvlan plugin can be chained with an earlier plugin that handles this logic. If the master is omitted, then the previous result must contain a single interface name for the ipvlan plugin to enslave. If ipam is omitted, then the previous result is used to configure the ipvlan interface.
21.2.4.3.1. ipvlan configuration example Copy linkLink copied to clipboard!
The following example configures an additional network named ipvlan-net:
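For example, a configuration of roughly this shape, assuming the parent interface eth1 and a static address chosen for illustration:

{
  "cniVersion": "0.3.1",
  "name": "ipvlan-net",
  "type": "ipvlan",
  "master": "eth1",
  "mode": "l3",
  "ipam": {
    "type": "static",
    "addresses": [
      {
        "address": "192.168.10.10/24"
      }
    ]
  }
}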
21.2.4.4. Configuration for a MACVLAN additional network Copy linkLink copied to clipboard!
The following object describes the configuration parameters for the MAC Virtual LAN (MACVLAN) Container Network Interface (CNI) plugin:
| Field | Type | Description |
|---|---|---|
| cniVersion | string | The CNI specification version. The 0.3.1 value is required. |
| name | string | The value for the name parameter you provided previously for the CNO configuration. |
| type | string | The name of the CNI plugin to configure: macvlan. |
| ipam | object | The configuration object for the IPAM CNI plugin. The plugin manages IP address assignment for the attachment definition. |
| mode | string | Optional: Configures traffic visibility on the virtual network. Must be either bridge, passthru, private, or vepa. If a value is not provided, the default value is bridge. |
| master | string | Optional: The host network interface to associate with the newly created macvlan interface. If a value is not specified, then the default route interface is used. |
| mtu | integer | Optional: Set the maximum transmission unit (MTU) to the specified value. The default value is automatically set by the kernel. |
If you specify the master key for the plugin configuration, use a different physical network interface than the one that is associated with your primary network plugin to avoid possible conflicts.
21.2.4.4.1. MACVLAN configuration example Copy linkLink copied to clipboard!
The following example configures an additional network named macvlan-net:
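For example, a configuration of roughly this shape, assuming the parent interface eth1 and DHCP-based address assignment:

{
  "cniVersion": "0.3.1",
  "name": "macvlan-net",
  "type": "macvlan",
  "master": "eth1",
  "mode": "bridge",
  "ipam": {
    "type": "dhcp"
  }
}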
21.2.5. Configuration of IP address assignment for an additional network Copy linkLink copied to clipboard!
The IP address management (IPAM) Container Network Interface (CNI) plugin provides IP addresses for other CNI plugins.
You can use the following IP address assignment types:
- Static assignment.
- Dynamic assignment through a DHCP server. The DHCP server you specify must be reachable from the additional network.
- Dynamic assignment through the Whereabouts IPAM CNI plugin.
21.2.5.1. Static IP address assignment configuration Copy linkLink copied to clipboard!
The following table describes the configuration for static IP address assignment:
| Field | Type | Description |
|---|---|---|
| type | string | The IPAM address type. The value static is required. |
| addresses | array | An array of objects specifying IP addresses to assign to the virtual interface. Both IPv4 and IPv6 IP addresses are supported. |
| routes | array | An array of objects specifying routes to configure inside the pod. |
| dns | object | Optional: An array of objects specifying the DNS configuration. |
The addresses array requires objects with the following fields:
| Field | Type | Description |
|---|---|---|
| address | string | An IP address and network prefix that you specify. For example, if you specify 10.10.21.10/24, then the additional network is assigned the IP address 10.10.21.10 and the netmask is 255.255.255.0. |
| gateway | string | The default gateway to route egress network traffic to. |
The routes array requires objects with the following fields:
| Field | Type | Description |
|---|---|---|
| dst | string | The IP address range in CIDR format, such as 192.168.17.0/24 or 0.0.0.0/0 for the default route. |
| gw | string | The gateway where network traffic is routed. |
The optional dns configuration object has the following fields:
| Field | Type | Description |
|---|---|---|
| nameservers | array | An array of one or more IP addresses to send DNS queries to. |
| domain | string | The default domain to append to a hostname. For example, if the domain is set to example.com, a DNS lookup query for example-host is rewritten as example-host.example.com. |
| search | array | An array of domain names to append to an unqualified hostname, such as example-host, during a DNS lookup query. |
Static IP address assignment configuration example
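A minimal sketch, using an address and default route chosen for illustration:

{
  "ipam": {
    "type": "static",
    "addresses": [
      {
        "address": "10.10.21.10/24",
        "gateway": "10.10.21.254"
      }
    ],
    "routes": [
      {
        "dst": "0.0.0.0/0",
        "gw": "10.10.21.254"
      }
    ]
  }
}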
21.2.5.2. Dynamic IP address (DHCP) assignment configuration Copy linkLink copied to clipboard!
The following JSON describes the configuration for dynamic IP address assignment with DHCP.
A pod obtains its original DHCP lease when it is created. The lease must be periodically renewed by a minimal DHCP server deployment running on the cluster.
To trigger the deployment of the DHCP server, you must create a shim network attachment by editing the Cluster Network Operator configuration, as in the following example:
Example shim network attachment definition
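A sketch of such a shim attachment added to the CNO configuration; the attachment name, namespace, and bridge plugin type are illustrative, and the important part is the dhcp IPAM type:

apiVersion: operator.openshift.io/v1
kind: Network
metadata:
  name: cluster
spec:
  additionalNetworks:
  - name: dhcp-shim
    namespace: default
    type: Raw
    rawCNIConfig: |-
      {
        "name": "dhcp-shim",
        "cniVersion": "0.3.1",
        "type": "bridge",
        "ipam": {
          "type": "dhcp"
        }
      }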
| Field | Type | Description |
|---|---|---|
| type | string | The IPAM address type. The value dhcp is required. |
Dynamic IP address (DHCP) assignment configuration example
{
  "ipam": {
    "type": "dhcp"
  }
}
21.2.5.3. Dynamic IP address assignment configuration with Whereabouts Copy linkLink copied to clipboard!
The Whereabouts CNI plugin allows the dynamic assignment of an IP address to an additional network without the use of a DHCP server.
The following table describes the configuration for dynamic IP address assignment with Whereabouts:
| Field | Type | Description |
|---|---|---|
| type | string | The IPAM address type. The value whereabouts is required. |
| range | string | An IP address and range in CIDR notation. IP addresses are assigned from within this range of addresses. |
| exclude | array | Optional: A list of zero or more IP addresses and ranges in CIDR notation. IP addresses within an excluded address range are not assigned. |
Dynamic IP address assignment configuration example that uses Whereabouts
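A minimal sketch, using an address range and exclusions chosen for illustration:

{
  "ipam": {
    "type": "whereabouts",
    "range": "192.0.2.192/27",
    "exclude": [
      "192.0.2.192/30",
      "192.0.2.196/32"
    ]
  }
}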
21.2.5.4. Creating a Whereabouts reconciler daemon set Copy linkLink copied to clipboard!
The Whereabouts reconciler is responsible for managing dynamic IP address assignments for the pods within a cluster by using the Whereabouts IP Address Management (IPAM) solution. It ensures that each pod gets a unique IP address from the specified IP address range. It also handles IP address releases when pods are deleted or scaled down.
You can also use a NetworkAttachmentDefinition custom resource for dynamic IP address assignment.
The Whereabouts reconciler daemon set is automatically created when you configure an additional network through the Cluster Network Operator. It is not automatically created when you configure an additional network from a YAML manifest.
To trigger the deployment of the Whereabouts reconciler daemonset, you must manually create a whereabouts-shim network attachment by editing the Cluster Network Operator custom resource file.
Use the following procedure to deploy the Whereabouts reconciler daemonset.
Procedure
Edit the Network.operator.openshift.io custom resource (CR) by running the following command:

$ oc edit network.operator.openshift.io cluster

Modify the additionalNetworks parameter in the CR to add the whereabouts-shim network attachment definition, as in the example shown after this procedure.

- Save the file and exit the text editor.

Verify that the whereabouts-reconciler daemon set deployed successfully by running the following command:

$ oc get all -n openshift-multus | grep whereabouts-reconciler

The output lists the whereabouts-reconciler daemon set and its pods.
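A sketch of the whereabouts-shim entry added to the additionalNetworks list; the namespace and the bridge plugin type are illustrative choices:

apiVersion: operator.openshift.io/v1
kind: Network
metadata:
  name: cluster
spec:
  additionalNetworks:
  - name: whereabouts-shim
    namespace: default
    rawCNIConfig: |-
      {
        "name": "whereabouts-shim",
        "cniVersion": "0.3.1",
        "type": "bridge",
        "ipam": {
          "type": "whereabouts"
        }
      }
    type: Raw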
21.2.6. Creating an additional network attachment with the Cluster Network Operator Copy linkLink copied to clipboard!
The Cluster Network Operator (CNO) manages additional network definitions. When you specify an additional network to create, the CNO creates the NetworkAttachmentDefinition object automatically.
Do not edit the NetworkAttachmentDefinition objects that the Cluster Network Operator manages. Doing so might disrupt network traffic on your additional network.
Prerequisites
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
Procedure
Optional: Create the namespace for the additional networks:
$ oc create namespace <namespace_name>

To edit the CNO configuration, enter the following command:

$ oc edit networks.operator.openshift.io cluster

Modify the CR that you are creating by adding the configuration for the additional network that you are creating, as in the example CR shown after this procedure.

- Save your changes and quit the text editor to commit your changes.
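For reference, an additionalNetworks entry of the following general shape creates a network named test-network-1; the network name, namespace, and raw CNI configuration are illustrative:

apiVersion: operator.openshift.io/v1
kind: Network
metadata:
  name: cluster
spec:
  additionalNetworks:
  - name: test-network-1
    namespace: test-1
    rawCNIConfig: |-
      {
        "cniVersion": "0.3.1",
        "name": "test-network-1",
        "type": "macvlan",
        "master": "eth1",
        "ipam": {
          "type": "dhcp"
        }
      }
    type: Raw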
Verification
Confirm that the CNO created the NetworkAttachmentDefinition object by running the following command. There might be a delay before the CNO creates the object.

$ oc get network-attachment-definitions -n <namespace>

where:
<namespace> - Specifies the namespace for the network attachment that you added to the CNO configuration.

Example output

NAME              AGE
test-network-1    14m
21.2.7. Creating an additional network attachment by applying a YAML manifest Copy linkLink copied to clipboard!
Prerequisites
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
Procedure
Create a YAML file with your additional network configuration, such as the example in "Configuration of an additional network from a YAML manifest".

To create the additional network, enter the following command:

$ oc apply -f <file>.yaml

where:
<file> - Specifies the name of the file containing the YAML manifest.
21.3. About virtual routing and forwarding Copy linkLink copied to clipboard!
21.3.1. About virtual routing and forwarding Copy linkLink copied to clipboard!
Virtual routing and forwarding (VRF) devices combined with IP rules provide the ability to create virtual routing and forwarding domains. VRF reduces the number of permissions needed by CNF, and provides increased visibility of the network topology of secondary networks. VRF is used to provide multi-tenancy functionality, for example, where each tenant has its own unique routing tables and requires different default gateways.
Processes can bind a socket to the VRF device. Packets through the bound socket use the routing table associated with the VRF device. An important feature of VRF is that it impacts only OSI model layer 3 traffic and above, so L2 tools, such as LLDP, are not affected. This allows higher-priority IP rules, such as policy-based routing, to take precedence over the VRF device rules directing specific traffic.
21.3.1.1. Benefits of secondary networks for pods for telecommunications operators Copy linkLink copied to clipboard!
In telecommunications use cases, each CNF can potentially be connected to multiple different networks sharing the same address space. These secondary networks can potentially conflict with the cluster's main network CIDR. Using the CNI VRF plugin, network functions can be connected to different customers' infrastructure by using the same IP address, keeping different customers isolated. The IP addresses can overlap with the OpenShift Container Platform IP space. The CNI VRF plugin also reduces the number of permissions needed by the CNF and increases the visibility of network topologies of secondary networks.
21.4. Configuring multi-network policy Copy linkLink copied to clipboard!
As a cluster administrator, you can configure multi-network policy for additional networks. You can specify multi-network policy for SR-IOV and macvlan additional networks. Macvlan additional networks are fully supported. Other types of additional networks, such as ipvlan, are not supported.
Support for configuring multi-network policies for SR-IOV additional networks is a Technology Preview feature and is only supported with kernel network interface cards (NICs). SR-IOV is not supported for Data Plane Development Kit (DPDK) applications.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
Configured network policies are ignored in IPv6 networks.
21.4.1. Differences between multi-network policy and network policy Copy linkLink copied to clipboard!
Although the MultiNetworkPolicy API implements the NetworkPolicy API, there are several important differences:
- You must use the MultiNetworkPolicy API:

  apiVersion: k8s.cni.cncf.io/v1beta1
  kind: MultiNetworkPolicy

- You must use the multi-networkpolicy resource name when using the CLI to interact with multi-network policies. For example, you can view a multi-network policy object with the oc get multi-networkpolicy <name> command where <name> is the name of a multi-network policy.
- You can use the k8s.v1.cni.cncf.io/policy-for annotation on a MultiNetworkPolicy object to point to a NetworkAttachmentDefinition (NAD) custom resource (CR). The NAD CR defines the network to which the policy applies.

  Example multi-network policy that includes the k8s.v1.cni.cncf.io/policy-for annotation

  apiVersion: k8s.cni.cncf.io/v1beta1
  kind: MultiNetworkPolicy
  metadata:
    annotations:
      k8s.v1.cni.cncf.io/policy-for: <namespace_name>/<network_name>

  where:
  <namespace_name> - Specifies the namespace name.
  <network_name> - Specifies the name of a network attachment definition.
21.4.2. Enabling multi-network policy for the cluster Copy linkLink copied to clipboard!
As a cluster administrator, you can enable multi-network policy support on your cluster.
Prerequisites
- Install the OpenShift CLI (oc).
- Log in to the cluster with a user with cluster-admin privileges.
Procedure
Create the multinetwork-enable-patch.yaml file with the YAML shown after this procedure.

Configure the cluster to enable multi-network policy:

$ oc patch network.operator.openshift.io cluster --type=merge --patch-file=multinetwork-enable-patch.yaml

Example output
network.operator.openshift.io/cluster patched
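The patch file referenced in the first step enables the useMultiNetworkPolicy flag; a minimal sketch looks like this:

spec:
  useMultiNetworkPolicy: true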
21.4.3. Working with multi-network policy Copy linkLink copied to clipboard!
As a cluster administrator, you can create, edit, view, and delete multi-network policies.
21.4.3.1. Prerequisites Copy linkLink copied to clipboard!
- You have enabled multi-network policy support for your cluster.
21.4.3.2. Creating a multi-network policy using the CLI Copy linkLink copied to clipboard!
To define granular rules describing ingress or egress network traffic allowed for namespaces in your cluster, you can create a multi-network policy.
Prerequisites
- Your cluster uses a network plugin that supports NetworkPolicy objects, such as the OpenShift SDN network provider with mode: NetworkPolicy set. This mode is the default for OpenShift SDN.
- You installed the OpenShift CLI (oc).
- You are logged in to the cluster with a user with cluster-admin privileges.
- You are working in the namespace that the multi-network policy applies to.
Procedure
Create a policy rule:
Create a <policy_name>.yaml file:

$ touch <policy_name>.yaml

where:
<policy_name> - Specifies the multi-network policy file name.
Define a multi-network policy in the file that you just created, such as in the following examples:
Deny ingress from all pods in all namespaces
This is a fundamental policy, blocking all cross-pod networking other than cross-pod traffic allowed by the configuration of other Network Policies.
Copy to Clipboard Copied! Toggle word wrap Toggle overflow where:
<network_name>- Specifies the name of a network attachment definition.
Allow ingress from all pods in the same namespace
Copy to Clipboard Copied! Toggle word wrap Toggle overflow where:
<network_name>- Specifies the name of a network attachment definition.
Allow ingress traffic to one pod from a particular namespace
This policy allows traffic to pods labelled pod-a from pods running in namespace-y.

where:
<network_name> - Specifies the name of a network attachment definition.
Restrict traffic to a service
This policy, when applied, ensures every pod with both labels app=bookstore and role=api can only be accessed by pods with the label app=bookstore. In this example the application could be a REST API server, marked with labels app=bookstore and role=api.

This example addresses the following use cases:
- Restricting the traffic to a service to only the other microservices that need to use it.
- Restricting the connections to a database to only permit the application using it.

where:
<network_name> - Specifies the name of a network attachment definition.
To create the multi-network policy object, enter the following command:
oc apply -f <policy_name>.yaml -n <namespace>
$ oc apply -f <policy_name>.yaml -n <namespace>Copy to Clipboard Copied! Toggle word wrap Toggle overflow where:
<policy_name>- Specifies the multi-network policy file name.
<namespace>- Optional: Specifies the namespace if the object is defined in a different namespace than the current namespace.
Example output
multinetworkpolicy.k8s.cni.cncf.io/deny-by-default created
multinetworkpolicy.k8s.cni.cncf.io/deny-by-default createdCopy to Clipboard Copied! Toggle word wrap Toggle overflow
If you log in to the web console with cluster-admin privileges, you have a choice of creating a network policy in any namespace in the cluster directly in YAML or from a form in the web console.
21.4.3.3. Editing a multi-network policy Copy linkLink copied to clipboard!
You can edit a multi-network policy in a namespace.
Prerequisites
- Your cluster uses a network plugin that supports NetworkPolicy objects, such as the OpenShift SDN network provider with mode: NetworkPolicy set. This mode is the default for OpenShift SDN.
- You installed the OpenShift CLI (oc).
- You are logged in to the cluster with a user with cluster-admin privileges.
- You are working in the namespace where the multi-network policy exists.
Procedure
Optional: To list the multi-network policy objects in a namespace, enter the following command:
oc get multi-networkpolicy
$ oc get multi-networkpolicyCopy to Clipboard Copied! Toggle word wrap Toggle overflow where:
<namespace>- Optional: Specifies the namespace if the object is defined in a different namespace than the current namespace.
Edit the multi-network policy object.
If you saved the multi-network policy definition in a file, edit the file and make any necessary changes, and then enter the following command.
oc apply -n <namespace> -f <policy_file>.yaml
$ oc apply -n <namespace> -f <policy_file>.yamlCopy to Clipboard Copied! Toggle word wrap Toggle overflow where:
<namespace>- Optional: Specifies the namespace if the object is defined in a different namespace than the current namespace.
<policy_file>- Specifies the name of the file containing the network policy.
If you need to update the multi-network policy object directly, enter the following command:
oc edit multi-networkpolicy <policy_name> -n <namespace>
$ oc edit multi-networkpolicy <policy_name> -n <namespace>Copy to Clipboard Copied! Toggle word wrap Toggle overflow where:
<policy_name>- Specifies the name of the network policy.
<namespace>- Optional: Specifies the namespace if the object is defined in a different namespace than the current namespace.
Confirm that the multi-network policy object is updated.
oc describe multi-networkpolicy <policy_name> -n <namespace>
$ oc describe multi-networkpolicy <policy_name> -n <namespace>Copy to Clipboard Copied! Toggle word wrap Toggle overflow where:
<policy_name>- Specifies the name of the multi-network policy.
<namespace>- Optional: Specifies the namespace if the object is defined in a different namespace than the current namespace.
If you log in to the web console with cluster-admin privileges, you have a choice of editing a network policy in any namespace in the cluster directly in YAML or from the policy in the web console through the Actions menu.
21.4.3.4. Viewing multi-network policies using the CLI Copy linkLink copied to clipboard!
You can examine the multi-network policies in a namespace.
Prerequisites
- You installed the OpenShift CLI (oc).
- You are logged in to the cluster with a user with cluster-admin privileges.
- You are working in the namespace where the multi-network policy exists.
Procedure
List multi-network policies in a namespace:
To view multi-network policy objects defined in a namespace, enter the following command:
oc get multi-networkpolicy
$ oc get multi-networkpolicyCopy to Clipboard Copied! Toggle word wrap Toggle overflow Optional: To examine a specific multi-network policy, enter the following command:
oc describe multi-networkpolicy <policy_name> -n <namespace>
$ oc describe multi-networkpolicy <policy_name> -n <namespace>Copy to Clipboard Copied! Toggle word wrap Toggle overflow where:
<policy_name>- Specifies the name of the multi-network policy to inspect.
<namespace>- Optional: Specifies the namespace if the object is defined in a different namespace than the current namespace.
If you log in to the web console with cluster-admin privileges, you have a choice of viewing a network policy in any namespace in the cluster directly in YAML or from a form in the web console.
21.4.3.5. Deleting a multi-network policy using the CLI Copy linkLink copied to clipboard!
You can delete a multi-network policy in a namespace.
Prerequisites
- Your cluster uses a network plugin that supports NetworkPolicy objects, such as the OpenShift SDN network provider with mode: NetworkPolicy set. This mode is the default for OpenShift SDN.
- You installed the OpenShift CLI (oc).
- You are logged in to the cluster with a user with cluster-admin privileges.
- You are working in the namespace where the multi-network policy exists.
Procedure
To delete a multi-network policy object, enter the following command:
oc delete multi-networkpolicy <policy_name> -n <namespace>
$ oc delete multi-networkpolicy <policy_name> -n <namespace>Copy to Clipboard Copied! Toggle word wrap Toggle overflow where:
<policy_name>- Specifies the name of the multi-network policy.
<namespace>- Optional: Specifies the namespace if the object is defined in a different namespace than the current namespace.
Example output
multinetworkpolicy.k8s.cni.cncf.io/default-deny deleted
multinetworkpolicy.k8s.cni.cncf.io/default-deny deletedCopy to Clipboard Copied! Toggle word wrap Toggle overflow
If you log in to the web console with cluster-admin privileges, you have a choice of deleting a network policy in any namespace in the cluster directly in YAML or from the policy in the web console through the Actions menu.
21.4.3.6. Creating a default deny all multi-network policy Copy linkLink copied to clipboard!
This is a fundamental policy, blocking all cross-pod networking other than network traffic allowed by the configuration of other deployed network policies. This procedure enforces a deny-by-default policy.
If you log in with a user with the cluster-admin role, then you can create a network policy in any namespace in the cluster.
Prerequisites
- Your cluster uses a network plugin that supports NetworkPolicy objects, such as the OpenShift SDN network provider with mode: NetworkPolicy set. This mode is the default for OpenShift SDN.
- You installed the OpenShift CLI (oc).
- You are logged in to the cluster with a user with cluster-admin privileges.
- You are working in the namespace that the multi-network policy applies to.
Procedure
Create a YAML file that defines a deny-by-default policy to deny ingress from all pods in all namespaces, and save it as the deny-by-default.yaml file. The following callout notes and the sketch after them describe its contents:

- 1: namespace: default deploys this policy to the default namespace.
- 2: network_name: specifies the name of a network attachment definition.
- 3: podSelector: is empty; this means it matches all the pods. Therefore, the policy applies to all pods in the default namespace.
- 4: There are no ingress rules specified. This causes incoming traffic to be dropped to all pods.
Apply the policy by entering the following command:
$ oc apply -f deny-by-default.yaml

Example output
multinetworkpolicy.k8s.cni.cncf.io/deny-by-default created
21.4.3.7. Creating a multi-network policy to allow traffic from external clients Copy linkLink copied to clipboard!
With the deny-by-default policy in place you can proceed to configure a policy that allows traffic from external clients to a pod with the label app=web.
If you log in with a user with the cluster-admin role, then you can create a network policy in any namespace in the cluster.
Follow this procedure to configure a policy that allows external services, from the public Internet directly or by using a load balancer, to access the pod. Traffic is only allowed to a pod with the label app=web.
Prerequisites
- Your cluster uses a network plugin that supports NetworkPolicy objects, such as the OpenShift SDN network provider with mode: NetworkPolicy set. This mode is the default for OpenShift SDN.
- You installed the OpenShift CLI (oc).
- You are logged in to the cluster with a user with cluster-admin privileges.
- You are working in the namespace that the multi-network policy applies to.
Procedure
Create a policy that allows traffic from the public Internet directly or by using a load balancer to access the pod. Save the YAML in the web-allow-external.yaml file; a representative manifest is shown after this procedure.

Apply the policy by entering the following command:
$ oc apply -f web-allow-external.yaml

Example output
multinetworkpolicy.k8s.cni.cncf.io/web-allow-external created
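A sketch of the web-allow-external.yaml manifest referenced in the first step; the policy-for annotation value is a placeholder:

apiVersion: k8s.cni.cncf.io/v1beta1
kind: MultiNetworkPolicy
metadata:
  name: web-allow-external
  namespace: default
  annotations:
    k8s.v1.cni.cncf.io/policy-for: <network_name>
spec:
  policyTypes:
  - Ingress
  podSelector:
    matchLabels:
      app: web
  ingress:
  - {}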
This policy allows traffic from all resources, including external traffic.
21.4.3.8. Creating a multi-network policy allowing traffic to an application from all namespaces Copy linkLink copied to clipboard!
If you log in with a user with the cluster-admin role, then you can create a network policy in any namespace in the cluster.
Follow this procedure to configure a policy that allows traffic from all pods in all namespaces to a particular application.
Prerequisites
- Your cluster uses a network plugin that supports NetworkPolicy objects, such as the OpenShift SDN network provider with mode: NetworkPolicy set. This mode is the default for OpenShift SDN.
- You installed the OpenShift CLI (oc).
- You are logged in to the cluster with a user with cluster-admin privileges.
- You are working in the namespace that the multi-network policy applies to.
Procedure
Create a policy that allows traffic from all pods in all namespaces to a particular application. Save the YAML in the web-allow-all-namespaces.yaml file.

Note: By default, if you omit specifying a namespaceSelector it does not select any namespaces, which means the policy allows traffic only from the namespace the network policy is deployed to.

Apply the policy by entering the following command:
oc apply -f web-allow-all-namespaces.yaml
$ oc apply -f web-allow-all-namespaces.yamlCopy to Clipboard Copied! Toggle word wrap Toggle overflow Example output
multinetworkpolicy.k8s.cni.cncf.io/web-allow-all-namespaces created
multinetworkpolicy.k8s.cni.cncf.io/web-allow-all-namespaces createdCopy to Clipboard Copied! Toggle word wrap Toggle overflow
Verification
Start a web service in the default namespace by entering the following command:

$ oc run web --namespace=default --image=nginx --labels="app=web" --expose --port=80

Run the following command to deploy an alpine image in the secondary namespace and to start a shell:

$ oc run test-$RANDOM --namespace=secondary --rm -i -t --image=alpine -- sh

Run the following command in the shell and observe that the request is allowed:

# wget -qO- --timeout=2 http://web.default

The request succeeds, which confirms that traffic from the secondary namespace is allowed.
21.4.3.9. Creating a multi-network policy allowing traffic to an application from a namespace Copy linkLink copied to clipboard!
If you log in with a user with the cluster-admin role, then you can create a network policy in any namespace in the cluster.
Follow this procedure to configure a policy that allows traffic to a pod with the label app=web from a particular namespace. You might want to do this to:
- Restrict traffic to a production database only to namespaces where production workloads are deployed.
- Enable monitoring tools deployed to a particular namespace to scrape metrics from the current namespace.
Prerequisites
- Your cluster uses a network plugin that supports NetworkPolicy objects, such as the OpenShift SDN network provider with mode: NetworkPolicy set. This mode is the default for OpenShift SDN.
- You installed the OpenShift CLI (oc).
- You are logged in to the cluster with a user with cluster-admin privileges.
- You are working in the namespace that the multi-network policy applies to.
Procedure
Create a policy that allows traffic from all pods in a particular namespace with the label purpose=production. Save the YAML in the web-allow-prod.yaml file.

Apply the policy by entering the following command:
oc apply -f web-allow-prod.yaml
$ oc apply -f web-allow-prod.yamlCopy to Clipboard Copied! Toggle word wrap Toggle overflow Example output
multinetworkpolicy.k8s.cni.cncf.io/web-allow-prod created
multinetworkpolicy.k8s.cni.cncf.io/web-allow-prod createdCopy to Clipboard Copied! Toggle word wrap Toggle overflow
Verification
Start a web service in the default namespace by entering the following command:

$ oc run web --namespace=default --image=nginx --labels="app=web" --expose --port=80

Run the following command to create the prod namespace:

$ oc create namespace prod

Run the following command to label the prod namespace:

$ oc label namespace/prod purpose=production

Run the following command to create the dev namespace:

$ oc create namespace dev

Run the following command to label the dev namespace:

$ oc label namespace/dev purpose=testing

Run the following command to deploy an alpine image in the dev namespace and to start a shell:

$ oc run test-$RANDOM --namespace=dev --rm -i -t --image=alpine -- sh

Run the following command in the shell and observe that the request is blocked:

# wget -qO- --timeout=2 http://web.default

Expected output

wget: download timed out

Run the following command to deploy an alpine image in the prod namespace and start a shell:

$ oc run test-$RANDOM --namespace=prod --rm -i -t --image=alpine -- sh

Run the following command in the shell and observe that the request is allowed:

# wget -qO- --timeout=2 http://web.default

The request succeeds, which confirms that traffic from the prod namespace is allowed.
21.5. Attaching a pod to an additional network Copy linkLink copied to clipboard!
As a cluster user you can attach a pod to an additional network.
21.5.1. Adding a pod to an additional network Copy linkLink copied to clipboard!
You can add a pod to an additional network. The pod continues to send normal cluster-related network traffic over the default network.
When a pod is created, additional networks are attached to it. However, if a pod already exists, you cannot attach additional networks to it.
The pod must be in the same namespace as the additional network.
Prerequisites
- Install the OpenShift CLI (oc).
- Log in to the cluster.
Procedure
Add an annotation to the Pod object. Only one of the following annotation formats can be used:

To attach an additional network without any customization, add an annotation with the following format. Replace <network> with the name of the additional network to associate with the pod:

metadata:
  annotations:
    k8s.v1.cni.cncf.io/networks: <network>[,<network>,...]

- 1: To specify more than one additional network, separate each network with a comma. Do not include whitespace before or after the comma. If you specify the same additional network multiple times, that pod will have multiple network interfaces attached to that network.
To attach an additional network with customizations, add an annotation with the following format:
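A sketch of the customized form, where the annotation value is a JSON array of objects; the keys shown are the commonly used ones and the values are placeholders:

metadata:
  annotations:
    k8s.v1.cni.cncf.io/networks: |-
      [
        {
          "name": "<network>",
          "namespace": "<namespace>",
          "default-route": ["<default-route>"]
        }
      ]

Here, name is the NetworkAttachmentDefinition name, namespace is the namespace where it is defined, and default-route optionally sets the gateway used for the default route.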
To create the pod, enter the following command. Replace <name> with the name of the pod.

$ oc create -f <name>.yaml

Optional: To confirm that the annotation exists in the Pod CR, enter the following command, replacing <name> with the name of the pod.

$ oc get pod <name> -o yaml

In the following example, the example-pod pod is attached to the net1 additional network:
- The
k8s.v1.cni.cncf.io/networks-statusparameter is a JSON array of objects. Each object describes the status of an additional network attached to the pod. The annotation value is stored as a plain text value.
21.5.1.1. Specifying pod-specific addressing and routing options Copy linkLink copied to clipboard!
When attaching a pod to an additional network, you may want to specify further properties about that network in a particular pod. This allows you to change some aspects of routing, as well as specify static IP addresses and MAC addresses. To accomplish this, you can use the JSON formatted annotations.
Prerequisites
- The pod must be in the same namespace as the additional network.
- Install the OpenShift CLI (oc).
- You must log in to the cluster.
Procedure
To add a pod to an additional network while specifying addressing and/or routing options, complete the following steps:
Edit the Pod resource definition. If you are editing an existing Pod resource, run the following command to edit its definition in the default editor. Replace <name> with the name of the Pod resource to edit.

$ oc edit pod <name>

In the Pod resource definition, add the k8s.v1.cni.cncf.io/networks parameter to the pod metadata mapping. The k8s.v1.cni.cncf.io/networks parameter accepts a JSON string of a list of objects that reference the names of NetworkAttachmentDefinition custom resources (CRs), in addition to specifying additional properties.

metadata:
  annotations:
    k8s.v1.cni.cncf.io/networks: '[<network>[,<network>,...]]'

- 1: Replace <network> with a JSON object as shown in the following examples. The single quotes are required.
In the following example, the annotation specifies which network attachment will have the default route, using the default-route parameter.

- 1: The name key is the name of the additional network to associate with the pod.
- 2: The default-route key specifies a value of a gateway for traffic to be routed over if no other routing entry is present in the routing table. If more than one default-route key is specified, this will cause the pod to fail to become active.
The default route will cause any traffic that is not specified in other routes to be routed to the gateway.
Setting the default route to an interface other than the default network interface for OpenShift Container Platform might cause traffic that is expected to travel pod-to-pod to be routed over another interface.
To verify the routing properties of a pod, the oc command may be used to execute the ip command within a pod.
oc exec -it <pod_name> -- ip route
$ oc exec -it <pod_name> -- ip route
You may also reference the pod’s k8s.v1.cni.cncf.io/networks-status to see which additional network has been assigned the default route, by the presence of the default-route key in the JSON-formatted list of objects.
To set a static IP address or MAC address for a pod, you can use JSON-formatted annotations. This requires that you create networks that specifically allow for this functionality. This can be specified in a rawCNIConfig for the CNO.
Edit the CNO CR by running the following command:
oc edit networks.operator.openshift.io cluster
$ oc edit networks.operator.openshift.io clusterCopy to Clipboard Copied! Toggle word wrap Toggle overflow
The following YAML describes the configuration parameters for the CNO:
Cluster Network Operator YAML configuration
- 1: Specify a name for the additional network attachment that you are creating. The name must be unique within the specified namespace.
- 2: Specify the namespace to create the network attachment in. If you do not specify a value, then the default namespace is used.
- 3: Specify the CNI plugin configuration in JSON format, which is based on the following template.
The following object describes the configuration parameters for utilizing static MAC address and IP address using the macvlan CNI plugin:
macvlan CNI plugin JSON configuration object using static IP and MAC address
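A sketch of such an additionalNetworks entry, assuming eth0 as the macvlan parent interface; the callout notes below refer, in order, to the attachment name, the plugins array, the ips capability request, the master interface, and the mac capability request:

name: <name>
namespace: <namespace>
type: Raw
rawCNIConfig: |-
  {
    "cniVersion": "0.3.1",
    "name": "<name>",
    "plugins": [
      {
        "type": "macvlan",
        "capabilities": { "ips": true },
        "master": "eth0",
        "mode": "bridge",
        "ipam": {
          "type": "static"
        }
      },
      {
        "capabilities": { "mac": true },
        "type": "tuning"
      }
    ]
  }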
- 1: Specifies the name for the additional network attachment to create. The name must be unique within the specified namespace.
- 2: Specifies an array of CNI plugin configurations. The first object specifies a macvlan plugin configuration and the second object specifies a tuning plugin configuration.
- 3: Specifies that a request is made to enable the static IP address functionality of the CNI plugin runtime configuration capabilities.
- 4: Specifies the interface that the macvlan plugin uses.
- 5: Specifies that a request is made to enable the static MAC address functionality of a CNI plugin.
The above network attachment can be referenced in a JSON formatted annotation, along with keys to specify which static IP and MAC address will be assigned to a given pod.
Edit the pod with:
oc edit pod <name>
$ oc edit pod <name>
macvlan CNI plugin JSON configuration object using static IP and MAC address
Static IP addresses and MAC addresses do not have to be used at the same time; you can use them individually or together.
To verify the IP address and MAC properties of a pod with additional networks, use the oc command to execute the ip command within a pod.
oc exec -it <pod_name> -- ip a
$ oc exec -it <pod_name> -- ip a
21.6. Removing a pod from an additional network Copy linkLink copied to clipboard!
As a cluster user you can remove a pod from an additional network.
21.6.1. Removing a pod from an additional network Copy linkLink copied to clipboard!
You can remove a pod from an additional network only by deleting the pod.
Prerequisites
- An additional network is attached to the pod.
- Install the OpenShift CLI (oc).
- Log in to the cluster.
Procedure
To delete the pod, enter the following command:
$ oc delete pod <name> -n <namespace>

- <name> is the name of the pod.
- <namespace> is the namespace that contains the pod.
21.7. Editing an additional network Copy linkLink copied to clipboard!
As a cluster administrator you can modify the configuration for an existing additional network.
21.7.1. Modifying an additional network attachment definition Copy linkLink copied to clipboard!
As a cluster administrator, you can make changes to an existing additional network. Any existing pods attached to the additional network will not be updated.
Prerequisites
- You have configured an additional network for your cluster.
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
Procedure
To edit an additional network for your cluster, complete the following steps:
Run the following command to edit the Cluster Network Operator (CNO) CR in your default text editor:
oc edit networks.operator.openshift.io cluster
$ oc edit networks.operator.openshift.io clusterCopy to Clipboard Copied! Toggle word wrap Toggle overflow -
In the additionalNetworks collection, update the additional network with your changes.

- Save your changes and quit the text editor to commit your changes.
Optional: Confirm that the CNO updated the NetworkAttachmentDefinition object by running the following command. Replace <network-name> with the name of the additional network to display. There might be a delay before the CNO updates the NetworkAttachmentDefinition object to reflect your changes.

$ oc get network-attachment-definitions <network-name> -o yaml

For example, the following console output displays a NetworkAttachmentDefinition object that is named net1:

$ oc get network-attachment-definitions net1 -o go-template='{{printf "%s\n" .spec.config}}'
{ "cniVersion": "0.3.1", "type": "macvlan",
"master": "ens5",
"mode": "bridge",
"ipam": {"type":"static","routes":[{"dst":"0.0.0.0/0","gw":"10.128.2.1"}],"addresses":[{"address":"10.128.2.100/23","gateway":"10.128.2.1"}],"dns":{"nameservers":["172.30.0.10"],"domain":"us-west-2.compute.internal","search":["us-west-2.compute.internal"]}} }
21.8. Removing an additional network Copy linkLink copied to clipboard!
As a cluster administrator you can remove an additional network attachment.
21.8.1. Removing an additional network attachment definition Copy linkLink copied to clipboard!
As a cluster administrator, you can remove an additional network from your OpenShift Container Platform cluster. The additional network is not removed from any pods it is attached to.
Prerequisites
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
Procedure
To remove an additional network from your cluster, complete the following steps:
Edit the Cluster Network Operator (CNO) in your default text editor by running the following command:
$ oc edit networks.operator.openshift.io cluster

Modify the CR by removing the configuration from the additionalNetworks collection for the network attachment definition you are removing.

- 1: If you are removing the configuration mapping for the only additional network attachment definition in the additionalNetworks collection, you must specify an empty collection.
- Save your changes and quit the text editor to commit your changes.
Optional: Confirm that the additional network CR was deleted by running the following command:
oc get network-attachment-definition --all-namespaces
$ oc get network-attachment-definition --all-namespacesCopy to Clipboard Copied! Toggle word wrap Toggle overflow
21.9. Assigning a secondary network to a VRF Copy linkLink copied to clipboard!
As a cluster administrator, you can configure an additional network for a virtual routing and forwarding (VRF) domain by using the CNI VRF plugin. The virtual network that this plugin creates is associated with the physical interface that you specify.
Using a secondary network with a VRF instance has the following advantages:
- Workload isolation
- Isolate workload traffic by configuring a VRF instance for the additional network.
- Improved security
- Enable improved security through isolated network paths in the VRF domain.
- Multi-tenancy support
- Support multi-tenancy through network segmentation with a unique routing table in the VRF domain for each tenant.
Applications that use VRFs must bind to a specific device. The common usage is to use the SO_BINDTODEVICE option for a socket. The SO_BINDTODEVICE option binds the socket to the device that is specified in the passed interface name, for example, eth1. To use the SO_BINDTODEVICE option, the application must have the CAP_NET_RAW capability.
Using a VRF through the ip vrf exec command is not supported in OpenShift Container Platform pods. To use VRF, bind applications directly to the VRF interface.
21.9.1. Creating an additional network attachment with the CNI VRF plugin Copy linkLink copied to clipboard!
The Cluster Network Operator (CNO) manages additional network definitions. When you specify an additional network to create, the CNO creates the NetworkAttachmentDefinition custom resource (CR) automatically.
Do not edit the NetworkAttachmentDefinition CRs that the Cluster Network Operator manages. Doing so might disrupt network traffic on your additional network.
To create an additional network attachment with the CNI VRF plugin, perform the following procedure.
Prerequisites
- Install the OpenShift Container Platform CLI (oc).
- Log in to the OpenShift cluster as a user with cluster-admin privileges.
Procedure
Create the
Networkcustom resource (CR) for the additional network attachment and insert therawCNIConfigconfiguration for the additional network, as in the following example CR. Save the YAML as the fileadditional-network-attachment.yaml.Copy to Clipboard Copied! Toggle word wrap Toggle overflow - 1
- 1
- plugins must be a list. The first item in the list must be the secondary network underpinning the VRF network. The second item in the list is the VRF plugin configuration.
- 2
- type must be set to vrf.
- 3
- vrfname is the name of the VRF that the interface is assigned to. If it does not exist in the pod, it is created.
- 4
- Optional. table is the routing table ID. By default, the tableid parameter is used. If it is not specified, the CNI assigns a free routing table ID to the VRF.
Note: VRF functions correctly only when the resource is of type netdevice.
Create the Network resource:
$ oc create -f additional-network-attachment.yaml
Confirm that the CNO created the NetworkAttachmentDefinition CR by running the following command. Replace <namespace> with the namespace that you specified when configuring the network attachment, for example, additional-network-1.
$ oc get network-attachment-definitions -n <namespace>
Example output
NAME                   AGE
additional-network-1   14m
Note: There might be a delay before the CNO creates the CR.
Verification
Create a pod and assign it to the additional network with the VRF instance:
Create a YAML file that defines the Pod resource:
Example pod-additional-net.yaml file
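A minimal sketch, reusing the illustrative test-network-1 attachment and additional-network-1 namespace from the earlier example; the k8s.v1.cni.cncf.io/networks annotation is the value that the callout below refers to.
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
  namespace: additional-network-1
  annotations:
    k8s.v1.cni.cncf.io/networks: test-network-1
spec:
  containers:
  - name: test-pod
    image: registry.access.redhat.com/ubi8/ubi-minimal
    command: ["sleep", "infinity"]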
- 1
- Specify the name of the additional network with the VRF instance.
Create the Pod resource by running the following command:
$ oc create -f pod-additional-net.yaml
Example output
pod/test-pod created
Verify that the pod network attachment is connected to the VRF additional network. Start a remote session with the pod and run the following command:
$ ip vrf show
Example output
Name     Table
-----------------------
vrf-1    1001
Confirm that the VRF interface is the controller for the additional interface:
$ ip link
Example output
5: net1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master red state UP mode
Chapter 22. Hardware networks
22.1. About Single Root I/O Virtualization (SR-IOV) hardware networks
The Single Root I/O Virtualization (SR-IOV) specification is a standard for a type of PCI device assignment that can share a single device with multiple pods.
SR-IOV can segment a compliant network device, recognized on the host node as a physical function (PF), into multiple virtual functions (VFs). The VF is used like any other network device. The SR-IOV network device driver for the device determines how the VF is exposed in the container:
- netdevice driver: A regular kernel network device in the netns of the container
- vfio-pci driver: A character device mounted in the container
You can use SR-IOV network devices with additional networks on your OpenShift Container Platform cluster installed on bare metal or Red Hat OpenStack Platform (RHOSP) infrastructure for applications that require high bandwidth or low latency.
You can configure multi-network policies for SR-IOV networks. Support for this feature is a Technology Preview, and SR-IOV additional networks are only supported with kernel NICs. They are not supported for Data Plane Development Kit (DPDK) applications.
Creating multi-network policies on SR-IOV networks might not deliver the same performance to applications compared to SR-IOV networks without a multi-network policy configured.
Multi-network policies for SR-IOV network is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
You can enable SR-IOV on a node by using the following command:
$ oc label node <node_name> feature.node.kubernetes.io/network-sriov.capable="true"
22.1.1. Components that manage SR-IOV network devices
The SR-IOV Network Operator creates and manages the components of the SR-IOV stack. It performs the following functions:
- Orchestrates discovery and management of SR-IOV network devices
- Generates NetworkAttachmentDefinition custom resources for the SR-IOV Container Network Interface (CNI)
- Creates and updates the configuration of the SR-IOV network device plugin
- Creates node-specific SriovNetworkNodeState custom resources
- Updates the spec.interfaces field in each SriovNetworkNodeState custom resource
The Operator provisions the following components:
- SR-IOV network configuration daemon
- A daemon set that is deployed on worker nodes when the SR-IOV Network Operator starts. The daemon is responsible for discovering and initializing SR-IOV network devices in the cluster.
- SR-IOV Network Operator webhook
- A dynamic admission controller webhook that validates the Operator custom resource and sets appropriate default values for unset fields.
- SR-IOV Network resources injector
- A dynamic admission controller webhook that provides functionality for patching Kubernetes pod specifications with requests and limits for custom network resources such as SR-IOV VFs. The SR-IOV network resources injector adds the resource field to only the first container in a pod automatically.
- SR-IOV network device plugin
- A device plugin that discovers, advertises, and allocates SR-IOV network virtual function (VF) resources. Device plugins are used in Kubernetes to enable the use of limited resources, typically in physical devices. Device plugins give the Kubernetes scheduler awareness of resource availability, so that the scheduler can schedule pods on nodes with sufficient resources.
- SR-IOV CNI plugin
- A CNI plugin that attaches VF interfaces allocated from the SR-IOV network device plugin directly into a pod.
- SR-IOV InfiniBand CNI plugin
- A CNI plugin that attaches InfiniBand (IB) VF interfaces allocated from the SR-IOV network device plugin directly into a pod.
The SR-IOV Network resources injector and SR-IOV Network Operator webhook are enabled by default and can be disabled by editing the default SriovOperatorConfig CR. Use caution when disabling the SR-IOV Network Operator Admission Controller webhook. You can disable the webhook under specific circumstances, such as troubleshooting, or if you want to use unsupported devices.
22.1.1.1. Supported platforms
The SR-IOV Network Operator is supported on the following platforms:
- Bare metal
- Red Hat OpenStack Platform (RHOSP)
22.1.1.2. Supported devices
OpenShift Container Platform supports the following network interface controllers:
| Manufacturer | Model | Vendor ID | Device ID |
|---|---|---|---|
| Broadcom | BCM57414 | 14e4 | 16d7 |
| Broadcom | BCM57508 | 14e4 | 1750 |
| Broadcom | BCM57504 | 14e4 | 1751 |
| Intel | X710 | 8086 | 1572 |
| Intel | XL710 | 8086 | 1583 |
| Intel | X710 Base T | 8086 | 15ff |
| Intel | XXV710 | 8086 | 158b |
| Intel | E810-CQDA2 | 8086 | 1592 |
| Intel | E810-2CQDA2 | 8086 | 1592 |
| Intel | E810-XXVDA2 | 8086 | 159b |
| Intel | E810-XXVDA4 | 8086 | 1593 |
| Mellanox | MT27700 Family [ConnectX‑4] | 15b3 | 1013 |
| Mellanox | MT27710 Family [ConnectX‑4 Lx] | 15b3 | 1015 |
| Mellanox | MT27800 Family [ConnectX‑5] | 15b3 | 1017 |
| Mellanox | MT28880 Family [ConnectX‑5 Ex] | 15b3 | 1019 |
| Mellanox | MT28908 Family [ConnectX‑6] | 15b3 | 101b |
| Mellanox | MT2892 Family [ConnectX‑6 Dx] | 15b3 | 101d |
| Mellanox | MT2894 Family [ConnectX‑6 Lx] | 15b3 | 101f |
| Mellanox | MT42822 BlueField‑2 in ConnectX‑6 NIC mode | 15b3 | a2d6 |
| Pensando [1] | DSC-25 dual-port 25G distributed services card for ionic driver | 0x1dd8 | 0x1002 |
| Pensando [1] | DSC-100 dual-port 100G distributed services card for ionic driver | 0x1dd8 | 0x1003 |
| Silicom | STS Family | 8086 | 1591 |
1. OpenShift SR-IOV is supported, but you must set a static Virtual Function (VF) media access control (MAC) address by using the SR-IOV CNI config file when using SR-IOV.
For the most up-to-date list of supported cards and compatible OpenShift Container Platform versions, see OpenShift Single Root I/O Virtualization (SR-IOV) and PTP hardware networks Support Matrix.
22.1.1.3. Automated discovery of SR-IOV network devices
The SR-IOV Network Operator searches your cluster for SR-IOV capable network devices on worker nodes. The Operator creates and updates a SriovNetworkNodeState custom resource (CR) for each worker node that provides a compatible SR-IOV network device.
The CR is assigned the same name as the worker node. The status.interfaces list provides information about the network devices on a node.
Do not modify a SriovNetworkNodeState object. The Operator creates and manages these resources automatically.
22.1.1.3.1. Example SriovNetworkNodeState object
The following YAML is an example of a SriovNetworkNodeState object created by the SR-IOV Network Operator:
An SriovNetworkNodeState object
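A minimal sketch of what such an object might look like; the node name, interface name, PCI address, and device values are illustrative:
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodeState
metadata:
  name: node-25
  namespace: openshift-sriov-network-operator
spec:
  dpConfigVersion: "39824"
status:
  interfaces:
  - deviceID: "1017"
    driver: mlx5_core
    mtu: 1500
    name: ens785f0
    pciAddress: "0000:18:00.0"
    totalvfs: 8
    vendor: "15b3"
  syncStatus: Succeeded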
22.1.1.4. Example use of a virtual function in a pod
You can run a remote direct memory access (RDMA) or a Data Plane Development Kit (DPDK) application in a pod with SR-IOV VF attached.
This example shows a pod using a virtual function (VF) in RDMA mode:
Pod spec that uses RDMA mode
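A minimal sketch of such a pod, assuming an SR-IOV network attachment named sriov-rdma-net and a container image that bundles the RDMA user-space libraries; both names are illustrative:
apiVersion: v1
kind: Pod
metadata:
  name: rdma-app
  annotations:
    k8s.v1.cni.cncf.io/networks: sriov-rdma-net
spec:
  containers:
  - name: rdma-app
    image: <rdma_image>
    securityContext:
      capabilities:
        add: ["IPC_LOCK"]
    resources:
      requests:
        memory: "1Gi"
        cpu: "2"
      limits:
        memory: "1Gi"
        cpu: "2"
    command: ["sleep", "infinity"]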
The following example shows a pod with a VF in DPDK mode:
Pod spec that uses DPDK mode
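A minimal sketch of a DPDK-mode pod, assuming an SR-IOV network attachment named sriov-dpdk-net whose VFs are bound to the vfio-pci driver and exposed through an illustrative resource name openshift.io/dpdknic; huge pages are mounted for the DPDK application:
apiVersion: v1
kind: Pod
metadata:
  name: dpdk-app
  annotations:
    k8s.v1.cni.cncf.io/networks: sriov-dpdk-net
spec:
  containers:
  - name: dpdk-app
    image: <dpdk_image>
    securityContext:
      capabilities:
        add: ["IPC_LOCK"]
    volumeMounts:
    - mountPath: /dev/hugepages
      name: hugepage
    resources:
      requests:
        openshift.io/dpdknic: "1"
        memory: "1Gi"
        cpu: "4"
        hugepages-1Gi: "2Gi"
      limits:
        openshift.io/dpdknic: "1"
        memory: "1Gi"
        cpu: "4"
        hugepages-1Gi: "2Gi"
    command: ["sleep", "infinity"]
  volumes:
  - name: hugepage
    emptyDir:
      medium: HugePages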
22.1.1.5. DPDK library for use with container applications
An optional library, app-netutil, provides several API methods for gathering network information about a pod from within a container running within that pod.
This library can assist with integrating SR-IOV virtual functions (VFs) in Data Plane Development Kit (DPDK) mode into the container. The library provides both a Golang API and a C API.
Currently there are three API methods implemented:
- GetCPUInfo() - This function determines which CPUs are available to the container and returns the list.
- GetHugepages() - This function determines the amount of huge page memory requested in the Pod spec for each container and returns the values.
- GetInterfaces() - This function determines the set of interfaces in the container and returns the list. The return value includes the interface type and type-specific data for each interface.
The repository for the library includes a sample Dockerfile to build a container image, dpdk-app-centos. The container image can run one of the following DPDK sample applications, depending on an environment variable in the pod specification: l2fwd, l3fwd, or testpmd. The container image provides an example of integrating the app-netutil library into the container image itself. The library can also integrate into an init container. The init container can collect the required data and pass the data to an existing DPDK workload.
22.1.1.6. Huge pages resource injection for Downward API
When a pod specification includes a resource request or limit for huge pages, the Network Resources Injector automatically adds Downward API fields to the pod specification to provide the huge pages information to the container.
The Network Resources Injector adds a volume that is named podnetinfo and is mounted at /etc/podnetinfo for each container in the pod. The volume uses the Downward API and includes a file for huge pages requests and limits. The file naming convention is as follows:
- /etc/podnetinfo/hugepages_1G_request_<container-name>
- /etc/podnetinfo/hugepages_1G_limit_<container-name>
- /etc/podnetinfo/hugepages_2M_request_<container-name>
- /etc/podnetinfo/hugepages_2M_limit_<container-name>
The paths specified in the previous list are compatible with the app-netutil library. By default, the library is configured to search for resource information in the /etc/podnetinfo directory. If you manually specify the Downward API path items, the app-netutil library searches for the following paths in addition to the paths in the previous list.
- /etc/podnetinfo/hugepages_request
- /etc/podnetinfo/hugepages_limit
- /etc/podnetinfo/hugepages_1G_request
- /etc/podnetinfo/hugepages_1G_limit
- /etc/podnetinfo/hugepages_2M_request
- /etc/podnetinfo/hugepages_2M_limit
As with the paths that the Network Resources Injector can create, the paths in the preceding list can optionally end with a _<container-name> suffix.
22.1.3. Next steps
22.2. Installing the SR-IOV Network Operator
You can install the Single Root I/O Virtualization (SR-IOV) Network Operator on your cluster to manage SR-IOV network devices and network attachments.
22.2.1. Installing SR-IOV Network Operator
As a cluster administrator, you can install the SR-IOV Network Operator by using the OpenShift Container Platform CLI or the web console.
22.2.1.1. CLI: Installing the SR-IOV Network Operator
As a cluster administrator, you can install the Operator using the CLI.
Prerequisites
- A cluster installed on bare-metal hardware with nodes that have hardware that supports SR-IOV.
- Install the OpenShift CLI (oc).
- An account with cluster-admin privileges.
Procedure
To create the openshift-sriov-network-operator namespace, enter the following command:
To create an OperatorGroup CR, enter the following command:
Subscribe to the SR-IOV Network Operator.
Run the following command to get the OpenShift Container Platform major and minor version. It is required for the channel value in the next step.
$ OC_VERSION=$(oc version -o yaml | grep openshiftVersion | \
  grep -o '[0-9]*[.][0-9]*' | head -1)
To create a Subscription CR for the SR-IOV Network Operator, enter the following command:
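Minimal sketches of the namespace, OperatorGroup, and Subscription manifests referenced in the preceding steps follow. The object names sriov-network-operators and sriov-network-operator-subscription are illustrative, and the Subscription assumes the redhat-operators catalog source:
$ cat << EOF | oc create -f -
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-sriov-network-operator
  annotations:
    workload.openshift.io/allowed: management
EOF

$ cat << EOF | oc create -f -
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: sriov-network-operators
  namespace: openshift-sriov-network-operator
spec:
  targetNamespaces:
  - openshift-sriov-network-operator
EOF

$ cat << EOF | oc create -f -
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: sriov-network-operator-subscription
  namespace: openshift-sriov-network-operator
spec:
  channel: "${OC_VERSION}"
  name: sriov-network-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF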
To verify that the Operator is installed, enter the following command:
$ oc get csv -n openshift-sriov-network-operator \
  -o custom-columns=Name:.metadata.name,Phase:.status.phase
Example output
Name                                         Phase
sriov-network-operator.4.12.0-202310121402   Succeeded
22.2.1.2. Web console: Installing the SR-IOV Network Operator
As a cluster administrator, you can install the Operator using the web console.
Prerequisites
- A cluster installed on bare-metal hardware with nodes that have hardware that supports SR-IOV.
- Install the OpenShift CLI (oc).
- An account with cluster-admin privileges.
Procedure
Install the SR-IOV Network Operator:
- In the OpenShift Container Platform web console, click Operators → OperatorHub.
- Select SR-IOV Network Operator from the list of available Operators, and then click Install.
- On the Install Operator page, under Installed Namespace, select Operator recommended Namespace.
- Click Install.
Verify that the SR-IOV Network Operator is installed successfully:
- Navigate to the Operators → Installed Operators page.
Ensure that SR-IOV Network Operator is listed in the openshift-sriov-network-operator project with a Status of InstallSucceeded.
Note: During installation an Operator might display a Failed status. If the installation later succeeds with an InstallSucceeded message, you can ignore the Failed message.
If the Operator does not appear as installed, to troubleshoot further:
- Inspect the Operator Subscriptions and Install Plans tabs for any failure or errors under Status.
- Navigate to the Workloads → Pods page and check the logs for pods in the openshift-sriov-network-operator project.
- Check the namespace of the YAML file. If the annotation is missing, you can add the annotation workload.openshift.io/allowed=management to the Operator namespace with the following command:
$ oc annotate ns/openshift-sriov-network-operator workload.openshift.io/allowed=management
Note: For single-node OpenShift clusters, the annotation workload.openshift.io/allowed=management is required for the namespace.
22.2.2. Next steps
- Optional: Configuring the SR-IOV Network Operator
22.3. Configuring the SR-IOV Network Operator
The Single Root I/O Virtualization (SR-IOV) Network Operator manages the SR-IOV network devices and network attachments in your cluster.
22.3.1. Configuring the SR-IOV Network Operator
Modifying the SR-IOV Network Operator configuration is not normally necessary. The default configuration is recommended for most use cases. Complete the steps to modify the relevant configuration only if the default behavior of the Operator is not compatible with your use case.
The SR-IOV Network Operator adds the SriovOperatorConfig.sriovnetwork.openshift.io CustomResourceDefinition resource. The Operator automatically creates a SriovOperatorConfig custom resource (CR) named default in the openshift-sriov-network-operator namespace.
The default CR contains the SR-IOV Network Operator configuration for your cluster. To change the Operator configuration, you must modify this CR.
22.3.1.1. SR-IOV Network Operator config custom resource
The fields for the SriovOperatorConfig custom resource are described in the following table:
| Field | Type | Description |
|---|---|---|
| metadata.name | string | Specifies the name of the SR-IOV Network Operator instance. The default value is default. |
| metadata.namespace | string | Specifies the namespace of the SR-IOV Network Operator instance. The default value is openshift-sriov-network-operator. |
| spec.configDaemonNodeSelector | string | Specifies the node selection to control scheduling the SR-IOV Network Config Daemon on selected nodes. By default, this field is not set and the Operator deploys the SR-IOV Network Config daemon set on worker nodes. |
| spec.disableDrain | boolean | Specifies whether to disable the node draining process or enable the node draining process when you apply a new policy to configure the NIC on a node. Setting this field to true disables node draining. By default, this field is not set. For single-node clusters, set this field to true. |
| spec.enableInjector | boolean | Specifies whether to enable or disable the Network Resources Injector daemon set. By default, this field is set to true. |
| spec.enableOperatorWebhook | boolean | Specifies whether to enable or disable the Operator Admission Controller webhook daemon set. By default, this field is set to true. |
| spec.logLevel | integer | Specifies the log verbosity level of the Operator. By default, this field is set to 2. |
22.3.1.2. About the Network Resources Injector
The Network Resources Injector is a Kubernetes Dynamic Admission Controller application. It provides the following capabilities:
- Mutation of resource requests and limits in a pod specification to add an SR-IOV resource name according to an SR-IOV network attachment definition annotation.
- Mutation of a pod specification with a Downward API volume to expose pod annotations, labels, and huge pages requests and limits. Containers that run in the pod can access the exposed information as files under the /etc/podnetinfo path.
By default, the Network Resources Injector is enabled by the SR-IOV Network Operator and runs as a daemon set on all control plane nodes. The following is an example of Network Resources Injector pods running in a cluster with three control plane nodes:
$ oc get pods -n openshift-sriov-network-operator
Example output
NAME READY STATUS RESTARTS AGE
network-resources-injector-5cz5p 1/1 Running 0 10m
network-resources-injector-dwqpx 1/1 Running 0 10m
network-resources-injector-lktz5 1/1 Running 0 10m
22.3.1.3. About the SR-IOV Network Operator admission controller webhook
The SR-IOV Network Operator Admission Controller webhook is a Kubernetes Dynamic Admission Controller application. It provides the following capabilities:
- Validation of the SriovNetworkNodePolicy CR when it is created or updated.
- Mutation of the SriovNetworkNodePolicy CR by setting the default value for the priority and deviceType fields when the CR is created or updated.
By default the SR-IOV Network Operator Admission Controller webhook is enabled by the Operator and runs as a daemon set on all control plane nodes.
Use caution when disabling the SR-IOV Network Operator Admission Controller webhook. You can disable the webhook under specific circumstances, such as troubleshooting, or if you want to use unsupported devices. For information about configuring unsupported devices, see Configuring the SR-IOV Network Operator to use an unsupported NIC.
The following is an example of the Operator Admission Controller webhook pods running in a cluster with three control plane nodes:
$ oc get pods -n openshift-sriov-network-operator
Example output
NAME READY STATUS RESTARTS AGE
operator-webhook-9jkw6 1/1 Running 0 16m
operator-webhook-kbr5p 1/1 Running 0 16m
operator-webhook-rpfrl 1/1 Running 0 16m
22.3.1.4. About custom node selectors
The SR-IOV Network Config daemon discovers and configures the SR-IOV network devices on cluster nodes. By default, it is deployed to all the worker nodes in the cluster. You can use node labels to specify on which nodes the SR-IOV Network Config daemon runs.
22.3.1.5. Disabling or enabling the Network Resources Injector
To disable or enable the Network Resources Injector, which is enabled by default, complete the following procedure.
Prerequisites
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
- You must have installed the SR-IOV Network Operator.
Procedure
Set the enableInjector field. Replace <value> with false to disable the feature or true to enable the feature.
$ oc patch sriovoperatorconfig default \
  --type=merge -n openshift-sriov-network-operator \
  --patch '{ "spec": { "enableInjector": <value> } }'
Tip: You can alternatively apply the following YAML to update the Operator:
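A minimal sketch of the equivalent SriovOperatorConfig update:
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovOperatorConfig
metadata:
  name: default
  namespace: openshift-sriov-network-operator
spec:
  enableInjector: <value>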
22.3.1.6. Disabling or enabling the SR-IOV Network Operator admission controller webhook
To disable or enable the admission controller webhook, which is enabled by default, complete the following procedure.
Prerequisites
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
- You must have installed the SR-IOV Network Operator.
Procedure
Set the enableOperatorWebhook field. Replace <value> with false to disable the feature or true to enable it:
$ oc patch sriovoperatorconfig default --type=merge \
  -n openshift-sriov-network-operator \
  --patch '{ "spec": { "enableOperatorWebhook": <value> } }'
Tip: You can alternatively apply the following YAML to update the Operator:
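A minimal sketch of the equivalent SriovOperatorConfig update:
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovOperatorConfig
metadata:
  name: default
  namespace: openshift-sriov-network-operator
spec:
  enableOperatorWebhook: <value>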
22.3.1.7. Configuring a custom NodeSelector for the SR-IOV Network Config daemon
The SR-IOV Network Config daemon discovers and configures the SR-IOV network devices on cluster nodes. By default, it is deployed to all the worker nodes in the cluster. You can use node labels to specify on which nodes the SR-IOV Network Config daemon runs.
To specify the nodes where the SR-IOV Network Config daemon is deployed, complete the following procedure.
When you update the configDaemonNodeSelector field, the SR-IOV Network Config daemon is recreated on each selected node. While the daemon is recreated, cluster users are unable to apply any new SR-IOV Network node policy or create new SR-IOV pods.
Procedure
To update the node selector for the operator, enter the following command:
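A sketch of such a command, using the same merge-patch form as the other oc patch examples in this section:
$ oc patch sriovoperatorconfig default --type=merge \
  -n openshift-sriov-network-operator \
  --patch '{ "spec": { "configDaemonNodeSelector": {<node_label>} } }'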
Replace <node_label> with a label to apply, as in the following example: "node-role.kubernetes.io/worker": "".
Tip: You can alternatively apply the following YAML to update the Operator:
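A minimal sketch of the equivalent SriovOperatorConfig update:
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovOperatorConfig
metadata:
  name: default
  namespace: openshift-sriov-network-operator
spec:
  configDaemonNodeSelector:
    <node_label>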
22.3.1.8. Configuring the SR-IOV Network Operator for single node installations
By default, the SR-IOV Network Operator drains workloads from a node before every policy change. The Operator performs this action to ensure that there are no workloads using the virtual functions before the reconfiguration.
For installations on a single node, there are no other nodes to receive the workloads. As a result, the Operator must be configured not to drain the workloads from the single node.
After performing the following procedure to disable draining workloads, you must remove any workload that uses an SR-IOV network interface before you change any SR-IOV network node policy.
Prerequisites
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
- You must have installed the SR-IOV Network Operator.
Procedure
To set the disableDrain field to true, enter the following command:
$ oc patch sriovoperatorconfig default --type=merge \
  -n openshift-sriov-network-operator \
  --patch '{ "spec": { "disableDrain": true } }'
Tip: You can alternatively apply the following YAML to update the Operator:
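A minimal sketch of the equivalent SriovOperatorConfig update:
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovOperatorConfig
metadata:
  name: default
  namespace: openshift-sriov-network-operator
spec:
  disableDrain: true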
22.3.1.9. Deploying the SR-IOV Operator for hosted control planes
Hosted control planes is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
After you configure and deploy your hosting service cluster, you can create a subscription to the SR-IOV Operator on a hosted cluster. The SR-IOV pod runs on worker machines rather than the control plane.
Prerequisites
You have configured and deployed the hosted cluster.
Procedure
Create a namespace and an Operator group:
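Minimal sketches of these objects, mirroring the CLI installation earlier in this chapter; the OperatorGroup name is illustrative:
$ cat << EOF | oc create -f -
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-sriov-network-operator
EOF

$ cat << EOF | oc create -f -
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: sriov-network-operators
  namespace: openshift-sriov-network-operator
spec:
  targetNamespaces:
  - openshift-sriov-network-operator
EOF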
Create a subscription to the SR-IOV Operator:
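A minimal sketch of the subscription; the channel stable and the subscription name are illustrative assumptions, and the sketch assumes the redhat-operators catalog source:
$ cat << EOF | oc create -f -
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: sriov-network-operator-subscription
  namespace: openshift-sriov-network-operator
spec:
  channel: stable
  name: sriov-network-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF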
Verification
To verify that the SR-IOV Operator is ready, run the following command and view the resulting output:
$ oc get csv -n openshift-sriov-network-operator
Example output
NAME                                         DISPLAY                   VERSION               REPLACES                                     PHASE
sriov-network-operator.4.12.0-202211021237   SR-IOV Network Operator   4.12.0-202211021237   sriov-network-operator.4.12.0-202210290517   Succeeded
To verify that the SR-IOV pods are deployed, run the following command:
$ oc get pods -n openshift-sriov-network-operator
22.3.2. Next steps
22.4. Configuring an SR-IOV network device
You can configure a Single Root I/O Virtualization (SR-IOV) device in your cluster.
22.4.1. SR-IOV network node configuration object
You specify the SR-IOV network device configuration for a node by creating an SR-IOV network node policy. The API object for the policy is part of the sriovnetwork.openshift.io API group.
The following YAML describes an SR-IOV network node policy:
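A minimal sketch of the policy layout follows; all placeholder values are illustrative, and the numbered comments correspond to the callouts below.
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: <name>                                                  # 1
  namespace: openshift-sriov-network-operator                   # 2
spec:
  resourceName: <sriov_resource_name>                           # 3
  nodeSelector:                                                 # 4
    feature.node.kubernetes.io/network-sriov.capable: "true"
  priority: <priority>                                          # 5
  mtu: <mtu>                                                    # 6
  needVhostNet: false                                           # 7
  numVfs: <num>                                                 # 8
  nicSelector:                                                  # 9
    vendor: "<vendor_code>"                                     # 10
    deviceID: "<device_id>"                                     # 11
    pfNames: ["<pf_name>"]                                      # 12
    rootDevices: ["<pci_bus_id>"]                               # 13
    netFilter: "<filter_string>"                                # 14
  deviceType: <device_type>                                     # 15
  isRdma: false                                                 # 16
  linkType: <link_type>                                         # 17
  eSwitchMode: <mode>                                           # 18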
- 1
- The name for the custom resource object.
- 2
- The namespace where the SR-IOV Network Operator is installed.
- 3
- The resource name of the SR-IOV network device plugin. You can create multiple SR-IOV network node policies for a resource name.
When specifying a name, be sure to use the accepted syntax expression ^[a-zA-Z0-9_]+$ in the resourceName.
- 4
- The node selector specifies the nodes to configure. Only SR-IOV network devices on the selected nodes are configured. The SR-IOV Container Network Interface (CNI) plugin and device plugin are deployed on selected nodes only.
Important: The SR-IOV Network Operator applies node network configuration policies to nodes in sequence. Before applying node network configuration policies, the SR-IOV Network Operator checks if the machine config pool (MCP) for a node is in an unhealthy state such as Degraded or Updating. If a node is in an unhealthy MCP, the process of applying node network configuration policies to all targeted nodes in the cluster pauses until the MCP returns to a healthy state.
To avoid a node in an unhealthy MCP from blocking the application of node network configuration policies to other nodes, including nodes in other MCPs, you must create a separate node network configuration policy for each MCP.
- 5
- Optional: The priority is an integer value between 0 and 99. A smaller value receives higher priority. For example, a priority of 10 is higher than a priority of 99. The default value is 99.
- 6
- Optional: The maximum transmission unit (MTU) of the virtual function. The maximum MTU value can vary for different network interface controller (NIC) models.
Important: If you want to create a virtual function on the default network interface, ensure that the MTU is set to a value that matches the cluster MTU.
- 7
- Optional: Set needVhostNet to true to mount the /dev/vhost-net device in the pod. Use the mounted /dev/vhost-net device with Data Plane Development Kit (DPDK) to forward traffic to the kernel network stack.
- 8
- The number of virtual functions (VFs) to create for the SR-IOV physical network device. For an Intel network interface controller (NIC), the number of VFs cannot be larger than the total VFs supported by the device. For a Mellanox NIC, the number of VFs cannot be larger than 127.
- 9
- The NIC selector identifies the device for the Operator to configure. You do not have to specify values for all the parameters. It is recommended to identify the network device with enough precision to avoid selecting a device unintentionally.
If you specify rootDevices, you must also specify a value for vendor, deviceID, or pfNames. If you specify both pfNames and rootDevices at the same time, ensure that they refer to the same device. If you specify a value for netFilter, then you do not need to specify any other parameter because a network ID is unique.
- 10
- Optional: The vendor hexadecimal code of the SR-IOV network device. The only allowed values are 8086 and 15b3.
- 11
- Optional: The device hexadecimal code of the SR-IOV network device. For example, 101b is the device ID for a Mellanox ConnectX-6 device.
- 12
- Optional: An array of one or more physical function (PF) names for the device.
- 13
- Optional: An array of one or more PCI bus addresses for the PF of the device. Provide the address in the following format: 0000:02:00.1.
- 14
- Optional: The platform-specific network filter. The only supported platform is Red Hat OpenStack Platform (RHOSP). Acceptable values use the following format: openstack/NetworkID:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx. Replace xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx with the value from the /var/config/openstack/latest/network_data.json metadata file.
- 15
- Optional: The driver type for the virtual functions. The only allowed values are netdevice and vfio-pci. The default value is netdevice.
For a Mellanox NIC to work in DPDK mode on bare metal nodes, use the netdevice driver type and set isRdma to true.
- 16
- Optional: Configures whether to enable remote direct memory access (RDMA) mode. The default value is false.
If the isRdma parameter is set to true, you can continue to use the RDMA-enabled VF as a normal network device. A device can be used in either mode.
Set isRdma to true and additionally set needVhostNet to true to configure a Mellanox NIC for use with Fast Datapath DPDK applications.
Note: You cannot set the isRdma parameter to true for Intel NICs.
- 17
- Optional: The link type for the VFs. The default value is eth for Ethernet. Change this value to ib for InfiniBand.
When linkType is set to ib, isRdma is automatically set to true by the SR-IOV Network Operator webhook. When linkType is set to ib, deviceType should not be set to vfio-pci.
Do not set linkType to eth for SriovNetworkNodePolicy, because this can lead to an incorrect number of available devices reported by the device plugin.
- 18
- Optional: The NIC device mode. The only allowed values are legacy or switchdev.
When eSwitchMode is set to legacy, the default SR-IOV behavior is enabled.
When eSwitchMode is set to switchdev, hardware offloading is enabled.
22.4.1.1. SR-IOV network node configuration examples
The following example describes the configuration for an InfiniBand device:
Example configuration for an InfiniBand device
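A minimal sketch; the policy name, resource name, device IDs, and PCI address are illustrative:
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: policy-ib-net-1
  namespace: openshift-sriov-network-operator
spec:
  resourceName: ibnic1
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  numVfs: 4
  nicSelector:
    vendor: "15b3"
    deviceID: "101b"
    rootDevices:
      - "0000:19:00.0"
  linkType: ib
  isRdma: true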
The following example describes the configuration for an SR-IOV network device in a RHOSP virtual machine:
Example configuration for an SR-IOV device in a virtual machine
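A minimal sketch for RHOSP; the netFilter network ID is illustrative and must be replaced with the value from the RHOSP metadata file described earlier:
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: policy-sriov-net-openstack-1
  namespace: openshift-sriov-network-operator
spec:
  resourceName: sriovnic1
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  numVfs: 1
  nicSelector:
    netFilter: openstack/NetworkID:ea24bd04-8674-4f69-b0ee-fa0b3bd20509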
22.4.1.2. Virtual function (VF) partitioning for SR-IOV devices
In some cases, you might want to split virtual functions (VFs) from the same physical function (PF) into multiple resource pools. For example, you might want some of the VFs to load with the default driver and the remaining VFs load with the vfio-pci driver. In such a deployment, the pfNames selector in your SriovNetworkNodePolicy custom resource (CR) can be used to specify a range of VFs for a pool using the following format: <pfname>#<first_vf>-<last_vf>.
For example, the following YAML shows the selector for an interface named netpf0 with VF 2 through 7:
pfNames: ["netpf0#2-7"]
- netpf0 is the PF interface name.
- 2 is the first VF index (0-based) that is included in the range.
- 7 is the last VF index (0-based) that is included in the range.
You can select VFs from the same PF by using different policy CRs if the following requirements are met:
- The numVfs value must be identical for policies that select the same PF.
- The VF index must be in the range of 0 to <numVfs>-1. For example, if you have a policy with numVfs set to 8, then the <first_vf> value must not be smaller than 0, and the <last_vf> must not be larger than 7.
- The VF ranges in different policies must not overlap.
- The <first_vf> must not be larger than the <last_vf>.
The following example illustrates NIC partitioning for an SR-IOV device.
The policy policy-net-1 defines a resource pool net-1 that contains the VF 0 of PF netpf0 with the default VF driver. The policy policy-net-1-dpdk defines a resource pool net-1-dpdk that contains the VF 8 to 15 of PF netpf0 with the vfio VF driver.
Policy policy-net-1:
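A minimal sketch, assuming a PF named netpf0 as in the preceding text; numVfs is set to 16 in both policies so that the values match:
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: policy-net-1
  namespace: openshift-sriov-network-operator
spec:
  resourceName: net1
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  numVfs: 16
  nicSelector:
    pfNames: ["netpf0#0-0"]
  deviceType: netdevice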
Policy policy-net-1-dpdk:
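A minimal sketch of the second policy, selecting VF 8 to 15 of the same illustrative PF with the vfio-pci driver:
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: policy-net-1-dpdk
  namespace: openshift-sriov-network-operator
spec:
  resourceName: net1dpdk
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  numVfs: 16
  nicSelector:
    pfNames: ["netpf0#8-15"]
  deviceType: vfio-pci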
Verifying that the interface is successfully partitioned
Confirm that the interface is successfully partitioned into virtual functions (VFs) for the SR-IOV device by running the following command.
$ ip link show <interface>
- 1
- Replace <interface> with the interface that you specified when partitioning into VFs for the SR-IOV device, for example, ens3f1.
22.4.2. Configuring SR-IOV network devices
The SR-IOV Network Operator adds the SriovNetworkNodePolicy.sriovnetwork.openshift.io custom resource definition (CRD) to OpenShift Container Platform. You can configure an SR-IOV network device by creating a SriovNetworkNodePolicy custom resource (CR).
When applying the configuration specified in a SriovNetworkNodePolicy object, the SR-IOV Operator might drain the nodes, and in some cases, reboot nodes.
It might take several minutes for a configuration change to apply.
Prerequisites
- You installed the OpenShift CLI (oc).
- You have access to the cluster as a user with the cluster-admin role.
- You have installed the SR-IOV Network Operator.
- You have enough available nodes in your cluster to handle the evicted workload from drained nodes.
- You have not selected any control plane nodes for SR-IOV network device configuration.
Procedure
- Create an SriovNetworkNodePolicy object, and then save the YAML in the <name>-sriov-node-network.yaml file. Replace <name> with the name for this configuration.
- Optional: Label the SR-IOV capable cluster nodes with SriovNetworkNodePolicy.Spec.NodeSelector if they are not already labeled. For more information about labeling nodes, see "Understanding how to update labels on nodes".
- Create the SriovNetworkNodePolicy object:
$ oc create -f <name>-sriov-node-network.yaml
where <name> specifies the name for this configuration.
After applying the configuration update, all the pods in the sriov-network-operator namespace transition to the Running status.
- To verify that the SR-IOV network device is configured, enter the following command. Replace <node_name> with the name of a node with the SR-IOV network device that you just configured.
$ oc get sriovnetworknodestates -n openshift-sriov-network-operator <node_name> -o jsonpath='{.status.syncStatus}'
22.4.3. Troubleshooting SR-IOV configuration
After following the procedure to configure an SR-IOV network device, the following sections address some error conditions.
To display the state of nodes, run the following command:
$ oc get sriovnetworknodestates -n openshift-sriov-network-operator <node_name>
where: <node_name> specifies the name of a node with an SR-IOV network device.
Error output: Cannot allocate memory
"lastSyncError": "write /sys/bus/pci/devices/0000:3b:00.1/sriov_numvfs: cannot allocate memory"
"lastSyncError": "write /sys/bus/pci/devices/0000:3b:00.1/sriov_numvfs: cannot allocate memory"
When a node indicates that it cannot allocate memory, check the following items:
- Confirm that global SR-IOV settings are enabled in the BIOS for the node.
- Confirm that VT-d is enabled in the BIOS for the node.
22.4.4. Assigning an SR-IOV network to a VRF
As a cluster administrator, you can assign an SR-IOV network interface to your VRF domain by using the CNI VRF plugin.
To do this, add the VRF configuration to the optional metaPlugins parameter of the SriovNetwork resource.
Applications that use VRFs need to bind to a specific device. The common usage is to use the SO_BINDTODEVICE option for a socket. SO_BINDTODEVICE binds the socket to a device that is specified in the passed interface name, for example, eth1. To use SO_BINDTODEVICE, the application must have CAP_NET_RAW capabilities.
Using a VRF through the ip vrf exec command is not supported in OpenShift Container Platform pods. To use VRF, bind applications directly to the VRF interface.
22.4.4.1. Creating an additional SR-IOV network attachment with the CNI VRF plugin
The SR-IOV Network Operator manages additional network definitions. When you specify an additional SR-IOV network to create, the SR-IOV Network Operator creates the NetworkAttachmentDefinition custom resource (CR) automatically.
Do not edit NetworkAttachmentDefinition custom resources that the SR-IOV Network Operator manages. Doing so might disrupt network traffic on your additional network.
To create an additional SR-IOV network attachment with the CNI VRF plugin, perform the following procedure.
Prerequisites
- Install the OpenShift Container Platform CLI (oc).
- Log in to the OpenShift Container Platform cluster as a user with cluster-admin privileges.
Procedure
Create the SriovNetwork custom resource (CR) for the additional SR-IOV network attachment and insert the metaPlugins configuration, as in the following example CR. Save the YAML as the file sriov-network-attachment.yaml.
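A minimal sketch; the network name, target namespace, resource name, IPAM values, and VRF name are illustrative:
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: example-network
  namespace: additional-sriov-network-1
spec:
  ipam: |
    {
      "type": "host-local",
      "subnet": "10.56.217.0/24",
      "rangeStart": "10.56.217.171",
      "rangeEnd": "10.56.217.181",
      "routes": [{ "dst": "0.0.0.0/0" }],
      "gateway": "10.56.217.1"
    }
  vlan: 0
  resourceName: intelnics
  metaPlugins: |
    {
      "type": "vrf",
      "vrfname": "example-vrf-name"
    }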
Create the SriovNetwork resource:
$ oc create -f sriov-network-attachment.yaml
Verifying that the NetworkAttachmentDefinition CR is successfully created
Confirm that the SR-IOV Network Operator created the NetworkAttachmentDefinition CR by running the following command.
$ oc get network-attachment-definitions -n <namespace>
- 1
- Replace <namespace> with the namespace that you specified when configuring the network attachment, for example, additional-sriov-network-1.
Example output
NAME                         AGE
additional-sriov-network-1   14m
Note: There might be a delay before the SR-IOV Network Operator creates the CR.
Verifying that the additional SR-IOV network attachment is successful
To verify that the VRF CNI is correctly configured and the additional SR-IOV network attachment is attached, do the following:
- Create an SR-IOV network that uses the VRF CNI.
- Assign the network to a pod.
Verify that the pod network attachment is connected to the SR-IOV additional network. Remote shell into the pod and run the following command:
$ ip vrf show
Example output
Name     Table
-----------------------
red      10
Confirm the VRF interface is master of the secondary interface:
$ ip link
Example output
... 5: net1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master red state UP mode ...
22.4.5. Next steps
22.5. Configuring an SR-IOV Ethernet network attachment
You can configure an Ethernet network attachment for a Single Root I/O Virtualization (SR-IOV) device in the cluster.
22.5.1. Ethernet device configuration object
You can configure an Ethernet network device by defining an SriovNetwork object.
The following YAML describes an SriovNetwork object:
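A minimal sketch of the object layout follows; all placeholder values are illustrative, and the numbered comments correspond to the callouts below.
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: <name>                                 # 1
  namespace: openshift-sriov-network-operator  # 2
spec:
  resourceName: <sriov_resource_name>          # 3
  networkNamespace: <target_namespace>         # 4
  vlan: <vlan>                                 # 5
  spoofChk: "<spoof_check>"                    # 6
  ipam: |-                                     # 7
    {}
  linkState: <link_state>                      # 8
  maxTxRate: <max_tx_rate>                     # 9
  minTxRate: <min_tx_rate>                     # 10
  vlanQoS: <vlan_qos>                          # 11
  trust: "<trust_vf>"                          # 12
  capabilities: <capabilities>                 # 13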
- 1
- A name for the object. The SR-IOV Network Operator creates a NetworkAttachmentDefinition object with the same name.
- 2
- The namespace where the SR-IOV Network Operator is installed.
- 3
- The value for the spec.resourceName parameter from the SriovNetworkNodePolicy object that defines the SR-IOV hardware for this additional network.
- 4
- The target namespace for the SriovNetwork object. Only pods in the target namespace can attach to the additional network.
- 5
- Optional: A Virtual LAN (VLAN) ID for the additional network. The integer value must be from 0 to 4095. The default value is 0.
- 6
- Optional: The spoof check mode of the VF. The allowed values are the strings "on" and "off".
Important: You must enclose the value you specify in quotes or the object is rejected by the SR-IOV Network Operator.
- 7
- A configuration object for the IPAM CNI plugin as a YAML block scalar. The plugin manages IP address assignment for the attachment definition.
- 8
- Optional: The link state of the virtual function (VF). Allowed values are enable, disable, and auto.
- 9
- Optional: A maximum transmission rate, in Mbps, for the VF.
- 10
- Optional: A minimum transmission rate, in Mbps, for the VF. This value must be less than or equal to the maximum transmission rate.
Note: Intel NICs do not support the minTxRate parameter. For more information, see BZ#1772847.
- 11
- Optional: An IEEE 802.1p priority level for the VF. The default value is 0.
- 12
- Optional: The trust mode of the VF. The allowed values are the strings "on" and "off".
Important: You must enclose the value that you specify in quotes, or the SR-IOV Network Operator rejects the object.
- 13
- Optional: The capabilities to configure for this additional network. You can specify "{ "ips": true }" to enable IP address support or "{ "mac": true }" to enable MAC address support.
22.5.1.1. Configuration of IP address assignment for an additional network
The IP address management (IPAM) Container Network Interface (CNI) plugin provides IP addresses for other CNI plugins.
You can use the following IP address assignment types:
- Static assignment.
- Dynamic assignment through a DHCP server. The DHCP server you specify must be reachable from the additional network.
- Dynamic assignment through the Whereabouts IPAM CNI plugin.
22.5.1.1.1. Static IP address assignment configuration
The following table describes the configuration for static IP address assignment:
| Field | Type | Description |
|---|---|---|
| type | string | The IPAM address type. The value static is required. |
| addresses | array | An array of objects specifying IP addresses to assign to the virtual interface. Both IPv4 and IPv6 IP addresses are supported. |
| routes | array | An array of objects specifying routes to configure inside the pod. |
| dns | array | Optional: An array of objects specifying the DNS configuration. |
The addresses array requires objects with the following fields:
| Field | Type | Description |
|---|---|---|
| address | string | An IP address and network prefix that you specify. For example, if you specify 10.10.21.10/24, then the additional network is assigned an IP address of 10.10.21.10 and the netmask is 255.255.255.0. |
| gateway | string | The default gateway to route egress network traffic to. |
The routes array requires objects with the following fields:
| Field | Type | Description |
|---|---|---|
| dst | string | The IP address range in CIDR format, such as 192.168.17.0/24 or 0.0.0.0/0 for the default route. |
| gw | string | The gateway where network traffic is routed. |
The dns configuration accepts objects with the following fields:
| Field | Type | Description |
|---|---|---|
| nameservers | array | An array of one or more IP addresses to send DNS queries to. |
| domain | string | The default domain to append to a hostname. For example, if the domain is set to example.com, a DNS lookup query for example-host is rewritten as example-host.example.com. |
| search | array | An array of domain names to append to an unqualified hostname, such as example-host, during a DNS lookup query. |
Static IP address assignment configuration example
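A minimal sketch; the address value is illustrative:
{
  "ipam": {
    "type": "static",
    "addresses": [
      {
        "address": "191.168.1.7/24"
      }
    ]
  }
}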
22.5.1.1.2. Dynamic IP address (DHCP) assignment configuration
The following JSON describes the configuration for dynamic IP address assignment with DHCP.
A pod obtains its original DHCP lease when it is created. The lease must be periodically renewed by a minimal DHCP server deployment running on the cluster.
The SR-IOV Network Operator does not create a DHCP server deployment; the Cluster Network Operator is responsible for creating the minimal DHCP server deployment.
To trigger the deployment of the DHCP server, you must create a shim network attachment by editing the Cluster Network Operator configuration, as in the following example:
Example shim network attachment definition
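A minimal sketch of such a shim attachment; the attachment name, target namespace, and bridge CNI type are illustrative:
apiVersion: operator.openshift.io/v1
kind: Network
metadata:
  name: cluster
spec:
  additionalNetworks:
  - name: dhcp-shim
    namespace: default
    type: Raw
    rawCNIConfig: |-
      {
        "name": "dhcp-shim",
        "cniVersion": "0.3.1",
        "type": "bridge",
        "ipam": {
          "type": "dhcp"
        }
      }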
| Field | Type | Description |
|---|---|---|
| type | string | The IPAM address type. The value dhcp is required. |
Dynamic IP address (DHCP) assignment configuration example
{
  "ipam": {
    "type": "dhcp"
  }
}
22.5.1.1.3. Dynamic IP address assignment configuration with Whereabouts
The Whereabouts CNI plugin allows the dynamic assignment of an IP address to an additional network without the use of a DHCP server.
The following table describes the configuration for dynamic IP address assignment with Whereabouts:
| Field | Type | Description |
|---|---|---|
| type | string | The IPAM address type. The value whereabouts is required. |
| range | string | An IP address and range in CIDR notation. IP addresses are assigned from within this range of addresses. |
| exclude | array | Optional: A list of zero or more IP addresses and ranges in CIDR notation. IP addresses within an excluded address range are not assigned. |
Dynamic IP address assignment configuration example that uses Whereabouts
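A minimal sketch; the range and excluded addresses are illustrative:
{
  "ipam": {
    "type": "whereabouts",
    "range": "192.0.2.192/27",
    "exclude": [
      "192.0.2.192/30",
      "192.0.2.196/32"
    ]
  }
}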
22.5.1.1.4. Creating a Whereabouts reconciler daemon set
The Whereabouts reconciler is responsible for managing dynamic IP address assignments for the pods within a cluster by using the Whereabouts IP Address Management (IPAM) solution. It ensures that each pod gets a unique IP address from the specified IP address range. It also handles IP address releases when pods are deleted or scaled down.
You can also use a NetworkAttachmentDefinition custom resource for dynamic IP address assignment.
The Whereabouts reconciler daemon set is automatically created when you configure an additional network through the Cluster Network Operator. It is not automatically created when you configure an additional network from a YAML manifest.
To trigger the deployment of the Whereabouts reconciler daemonset, you must manually create a whereabouts-shim network attachment by editing the Cluster Network Operator custom resource file.
Use the following procedure to deploy the Whereabouts reconciler daemonset.
Procedure
Edit the Network.operator.openshift.io custom resource (CR) by running the following command:
$ oc edit network.operator.openshift.io cluster
Modify the additionalNetworks parameter in the CR to add the whereabouts-shim network attachment definition. For example:
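A minimal sketch of the whereabouts-shim entry; the target namespace, bridge CNI type, and range are illustrative:
apiVersion: operator.openshift.io/v1
kind: Network
metadata:
  name: cluster
spec:
  additionalNetworks:
  - name: whereabouts-shim
    namespace: default
    type: Raw
    rawCNIConfig: |-
      {
        "name": "whereabouts-shim",
        "cniVersion": "0.3.1",
        "type": "bridge",
        "ipam": {
          "type": "whereabouts",
          "range": "192.0.2.0/24"
        }
      }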
- Save the file and exit the text editor.
Verify that the whereabouts-reconciler daemon set deployed successfully by running the following command:
$ oc get all -n openshift-multus | grep whereabouts-reconciler
22.5.2. Configuring SR-IOV additional network
You can configure an additional network that uses SR-IOV hardware by creating an SriovNetwork object. When you create an SriovNetwork object, the SR-IOV Network Operator automatically creates a NetworkAttachmentDefinition object.
Do not modify or delete an SriovNetwork object if it is attached to any pods in a running state.
Prerequisites
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
Procedure
Create a SriovNetwork object, and then save the YAML in the <name>.yaml file, where <name> is a name for this additional network. The object specification might resemble the following example:
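A minimal sketch; the object name, resource name, target namespace, and IPAM values are illustrative:
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: attach1
  namespace: openshift-sriov-network-operator
spec:
  resourceName: net1
  networkNamespace: project2
  ipam: |-
    {
      "type": "host-local",
      "subnet": "10.56.217.0/24",
      "rangeStart": "10.56.217.171",
      "rangeEnd": "10.56.217.181",
      "gateway": "10.56.217.1"
    }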
To create the object, enter the following command:
$ oc create -f <name>.yaml
where <name> specifies the name of the additional network.
Optional: To confirm that the NetworkAttachmentDefinition object that is associated with the SriovNetwork object that you created in the previous step exists, enter the following command. Replace <namespace> with the networkNamespace value you specified in the SriovNetwork object.
$ oc get net-attach-def -n <namespace>
22.5.3. Next steps
22.6. Configuring an SR-IOV InfiniBand network attachment
You can configure an InfiniBand (IB) network attachment for a Single Root I/O Virtualization (SR-IOV) device in the cluster.
22.6.1. InfiniBand device configuration object
You can configure an InfiniBand (IB) network device by defining an SriovIBNetwork object.
The following YAML describes an SriovIBNetwork object:
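A minimal sketch of the object layout follows; all placeholder values are illustrative, and the numbered comments correspond to the callouts below.
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovIBNetwork
metadata:
  name: <name>                                 # 1
  namespace: openshift-sriov-network-operator  # 2
spec:
  resourceName: <sriov_resource_name>          # 3
  networkNamespace: <target_namespace>         # 4
  ipam: |-                                     # 5
    {}
  linkState: <link_state>                      # 6
  capabilities: <capabilities>                 # 7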
- 1
- A name for the object. The SR-IOV Network Operator creates a NetworkAttachmentDefinition object with the same name.
- 2
- The namespace where the SR-IOV Operator is installed.
- 3
- The value for the spec.resourceName parameter from the SriovNetworkNodePolicy object that defines the SR-IOV hardware for this additional network.
- 4
- The target namespace for the SriovIBNetwork object. Only pods in the target namespace can attach to the network device.
- 5
- Optional: A configuration object for the IPAM CNI plugin as a YAML block scalar. The plugin manages IP address assignment for the attachment definition.
- 6
- Optional: The link state of the virtual function (VF). Allowed values are enable, disable, and auto.
- 7
- Optional: The capabilities to configure for this network. You can specify "{ "ips": true }" to enable IP address support or "{ "infinibandGUID": true }" to enable IB Global Unique Identifier (GUID) support.
22.6.1.1. Configuration of IP address assignment for an additional network
The IP address management (IPAM) Container Network Interface (CNI) plugin provides IP addresses for other CNI plugins.
You can use the following IP address assignment types:
- Static assignment.
- Dynamic assignment through a DHCP server. The DHCP server you specify must be reachable from the additional network.
- Dynamic assignment through the Whereabouts IPAM CNI plugin.
22.6.1.1.1. Static IP address assignment configuration
The following table describes the configuration for static IP address assignment:
| Field | Type | Description |
|---|---|---|
| type | string | The IPAM address type. The value static is required. |
| addresses | array | An array of objects specifying IP addresses to assign to the virtual interface. Both IPv4 and IPv6 IP addresses are supported. |
| routes | array | An array of objects specifying routes to configure inside the pod. |
| dns | array | Optional: An array of objects specifying the DNS configuration. |
The addresses array requires objects with the following fields:
| Field | Type | Description |
|---|---|---|
| address | string | An IP address and network prefix that you specify. For example, if you specify 10.10.21.10/24, then the additional network is assigned an IP address of 10.10.21.10 and the netmask is 255.255.255.0. |
| gateway | string | The default gateway to route egress network traffic to. |
The routes array requires objects with the following fields:
| Field | Type | Description |
|---|---|---|
| dst | string | The IP address range in CIDR format, such as 192.168.17.0/24 or 0.0.0.0/0 for the default route. |
| gw | string | The gateway where network traffic is routed. |
The dns configuration accepts objects with the following fields:
| Field | Type | Description |
|---|---|---|
| nameservers | array | An array of one or more IP addresses to send DNS queries to. |
| domain | string | The default domain to append to a hostname. For example, if the domain is set to example.com, a DNS lookup query for example-host is rewritten as example-host.example.com. |
| search | array | An array of domain names to append to an unqualified hostname, such as example-host, during a DNS lookup query. |
Static IP address assignment configuration example
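The example block itself was stripped from this page. A minimal sketch using the fields described in the tables above (the address values are illustrative) might look like the following:

{
  "ipam": {
    "type": "static",
    "addresses": [
      {
        "address": "192.168.10.10/24",
        "gateway": "192.168.10.1"
      }
    ]
  }
}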
22.6.1.1.2. Dynamic IP address (DHCP) assignment configuration
The following JSON describes the configuration for dynamic IP address assignment with DHCP.
A pod obtains its original DHCP lease when it is created. The lease must be periodically renewed by a minimal DHCP server deployment running on the cluster.
To trigger the deployment of the DHCP server, you must create a shim network attachment by editing the Cluster Network Operator configuration, as in the following example:
Example shim network attachment definition
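The original example was not preserved. A sketch of a shim attachment added through the Cluster Network Operator configuration might look like the following; the attachment name, namespace, and bridge CNI type are illustrative assumptions:

apiVersion: operator.openshift.io/v1
kind: Network
metadata:
  name: cluster
spec:
  additionalNetworks:
  - name: dhcp-shim
    namespace: default
    type: Raw
    rawCNIConfig: |-
      {
        "name": "dhcp-shim",
        "cniVersion": "0.3.1",
        "type": "bridge",
        "ipam": {
          "type": "dhcp"
        }
      }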
| Field | Type | Description |
|---|---|---|
| type | string | The IPAM address type. The value dhcp is required. |
Dynamic IP address (DHCP) assignment configuration example
{
"ipam": {
"type": "dhcp"
}
}
22.6.1.1.3. Dynamic IP address assignment configuration with Whereabouts
The Whereabouts CNI plugin allows the dynamic assignment of an IP address to an additional network without the use of a DHCP server.
The following table describes the configuration for dynamic IP address assignment with Whereabouts:
| Field | Type | Description |
|---|---|---|
| type | string | The IPAM address type. The value whereabouts is required. |
| range | string | An IP address and range in CIDR notation. IP addresses are assigned from within this range of addresses. |
| exclude | array | Optional: A list of zero or more IP addresses and ranges in CIDR notation. IP addresses within an excluded address range are not assigned. |
Dynamic IP address assignment configuration example that uses Whereabouts
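The example block was stripped from this page. A sketch using the fields described above (the address ranges are illustrative) might look like the following:

{
  "ipam": {
    "type": "whereabouts",
    "range": "192.0.2.192/27",
    "exclude": [
      "192.0.2.192/30",
      "192.0.2.196/32"
    ]
  }
}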
22.6.1.1.4. Creating a Whereabouts reconciler daemon set
The Whereabouts reconciler is responsible for managing dynamic IP address assignments for the pods within a cluster by using the Whereabouts IP Address Management (IPAM) solution. It ensures that each pod gets a unique IP address from the specified IP address range. It also handles IP address releases when pods are deleted or scaled down.
You can also use a NetworkAttachmentDefinition custom resource for dynamic IP address assignment.
The Whereabouts reconciler daemon set is automatically created when you configure an additional network through the Cluster Network Operator. It is not automatically created when you configure an additional network from a YAML manifest.
To trigger the deployment of the Whereabouts reconciler daemonset, you must manually create a whereabouts-shim network attachment by editing the Cluster Network Operator custom resource file.
Use the following procedure to deploy the Whereabouts reconciler daemonset.
Procedure
Edit the Network.operator.openshift.io custom resource (CR) by running the following command:

$ oc edit network.operator.openshift.io cluster

Modify the additionalNetworks parameter in the CR to add the whereabouts-shim network attachment definition. For example:
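The original example was not preserved here. A minimal sketch consistent with the surrounding text might look like the following; the namespace and bridge CNI type are illustrative assumptions:

apiVersion: operator.openshift.io/v1
kind: Network
metadata:
  name: cluster
spec:
  additionalNetworks:
  - name: whereabouts-shim
    namespace: default
    type: Raw
    rawCNIConfig: |-
      {
        "name": "whereabouts-shim",
        "cniVersion": "0.3.1",
        "type": "bridge",
        "ipam": {
          "type": "whereabouts"
        }
      }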
- Save the file and exit the text editor.

Verify that the whereabouts-reconciler daemon set deployed successfully by running the following command:

$ oc get all -n openshift-multus | grep whereabouts-reconciler
22.6.2. Configuring SR-IOV additional network
You can configure an additional network that uses SR-IOV hardware by creating an SriovIBNetwork object. When you create an SriovIBNetwork object, the SR-IOV Network Operator automatically creates a NetworkAttachmentDefinition object.
Do not modify or delete an SriovIBNetwork object if it is attached to any pods in a running state.
Prerequisites
- Install the OpenShift CLI (oc).
- Log in as a user with cluster-admin privileges.
Procedure
Create a SriovIBNetwork object, and then save the YAML in the <name>.yaml file, where <name> is a name for this additional network. The object specification might resemble the following example:
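The example specification itself was stripped from this page; a brief hedged sketch, with illustrative names and IPAM values, might look like this:

apiVersion: sriovnetwork.openshift.io/v1
kind: SriovIBNetwork
metadata:
  name: ib1-network
  namespace: openshift-sriov-network-operator
spec:
  resourceName: ibnic1
  networkNamespace: ib1-test
  ipam: |
    {
      "type": "whereabouts",
      "range": "192.0.2.0/24"
    }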
To create the object, enter the following command:

$ oc create -f <name>.yaml

where
<name> specifies the name of the additional network.

Optional: To confirm that the NetworkAttachmentDefinition object that is associated with the SriovIBNetwork object that you created in the previous step exists, enter the following command. Replace <namespace> with the networkNamespace value you specified in the SriovIBNetwork object.

$ oc get net-attach-def -n <namespace>
22.6.3. Next steps
22.7. Adding a pod to an SR-IOV additional network
You can add a pod to an existing Single Root I/O Virtualization (SR-IOV) network.
22.7.1. Runtime configuration for a network attachment
When attaching a pod to an additional network, you can specify a runtime configuration to make specific customizations for the pod. For example, you can request a specific MAC hardware address.
You specify the runtime configuration by setting an annotation in the pod specification. The annotation key is k8s.v1.cni.cncf.io/networks, and it accepts a JSON object that describes the runtime configuration.
22.7.1.1. Runtime configuration for an Ethernet-based SR-IOV attachment
The following JSON describes the runtime configuration options for an Ethernet-based SR-IOV network attachment.
- 1
- The name of the SR-IOV network attachment definition CR.
- 2
- Optional: The MAC address for the SR-IOV device that is allocated from the resource type defined in the SR-IOV network attachment definition CR. To use this feature, you also must specify { "mac": true } in the SriovNetwork object.
- 3
- Optional: IP addresses for the SR-IOV device that are allocated from the resource type defined in the SR-IOV network attachment definition CR. Both IPv4 and IPv6 addresses are supported. To use this feature, you also must specify { "ips": true } in the SriovNetwork object.
Example runtime configuration
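The example block was stripped here. A minimal sketch of a pod annotation carrying this runtime configuration, with an illustrative network name, MAC, and IP addresses, might look like the following:

apiVersion: v1
kind: Pod
metadata:
  name: sample-pod
  annotations:
    k8s.v1.cni.cncf.io/networks: |-
      [
        {
          "name": "net1",
          "mac": "20:04:0f:f1:88:01",
          "ips": ["192.168.10.1/24", "2001::1/64"]
        }
      ]
spec:
  containers:
  - name: sample-container
    image: <image>
    imagePullPolicy: IfNotPresent
    command: ["sleep", "infinity"]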
22.7.1.2. Runtime configuration for an InfiniBand-based SR-IOV attachment
The following JSON describes the runtime configuration options for an InfiniBand-based SR-IOV network attachment.
- 1
- The name of the SR-IOV network attachment definition CR.
- 2
- The InfiniBand GUID for the SR-IOV device. To use this feature, you also must specify { "infinibandGUID": true } in the SriovIBNetwork object.
- 3
- The IP addresses for the SR-IOV device that is allocated from the resource type defined in the SR-IOV network attachment definition CR. Both IPv4 and IPv6 addresses are supported. To use this feature, you also must specify { "ips": true } in the SriovIBNetwork object.
Example runtime configuration
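The example block was stripped here. A hedged sketch of an InfiniBand runtime annotation, assuming the commonly used infiniband-guid key and illustrative network, GUID, and IP values, might look like the following:

apiVersion: v1
kind: Pod
metadata:
  name: sample-pod
  annotations:
    k8s.v1.cni.cncf.io/networks: |-
      [
        {
          "name": "ib1",
          "infiniband-guid": "c2:11:22:33:44:55:66:77",
          "ips": ["192.168.10.1/24", "2001::1/64"]
        }
      ]
spec:
  containers:
  - name: sample-container
    image: <image>
    imagePullPolicy: IfNotPresent
    command: ["sleep", "infinity"]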
22.7.2. Adding a pod to an additional network
You can add a pod to an additional network. The pod continues to send normal cluster-related network traffic over the default network.
Additional networks are attached to a pod when the pod is created. However, if a pod already exists, you cannot attach additional networks to it.
The pod must be in the same namespace as the additional network.
The SR-IOV Network Resource Injector adds the resource field to the first container in a pod automatically.
If you are using an Intel network interface controller (NIC) in Data Plane Development Kit (DPDK) mode, only the first container in your pod is configured to access the NIC. Your SR-IOV additional network is configured for DPDK mode if the deviceType is set to vfio-pci in the SriovNetworkNodePolicy object.
You can work around this issue by either ensuring that the container that needs access to the NIC is the first container defined in the Pod object or by disabling the Network Resource Injector. For more information, see BZ#1990953.
Prerequisites
- Install the OpenShift CLI (oc).
- Log in to the cluster.
- Install the SR-IOV Operator.
- Create either an SriovNetwork object or an SriovIBNetwork object to attach the pod to.
Procedure
Add an annotation to the Pod object. Only one of the following annotation formats can be used:

To attach an additional network without any customization, add an annotation with the following format. Replace <network> with the name of the additional network to associate with the pod:

metadata:
  annotations:
    k8s.v1.cni.cncf.io/networks: <network>[,<network>,...] 1

- 1
- To specify more than one additional network, separate each network with a comma. Do not include whitespace between the commas. If you specify the same additional network multiple times, that pod will have multiple network interfaces attached to that network.
To attach an additional network with customizations, add an annotation with the following format:
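The annotation example was stripped here. A hedged sketch showing the commonly used keys (name, namespace, and default-route; treat this as illustrative rather than the full set of supported customizations) might look like the following:

metadata:
  annotations:
    k8s.v1.cni.cncf.io/networks: |-
      [
        {
          "name": "<network>",
          "namespace": "<namespace>",
          "default-route": ["<default-route>"]
        }
      ]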
To create the pod, enter the following command. Replace <name> with the name of the pod.

$ oc create -f <name>.yaml

Optional: To confirm that the annotation exists in the Pod CR, enter the following command, replacing <name> with the name of the pod.
$ oc get pod <name> -o yaml

In the following example, the example-pod pod is attached to the net1 additional network:
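The example output was stripped here. A hedged sketch of the relevant part of the pod, with illustrative interface names, addresses, and a default network name, might look like the following:

apiVersion: v1
kind: Pod
metadata:
  name: example-pod
  annotations:
    k8s.v1.cni.cncf.io/networks: net1
    k8s.v1.cni.cncf.io/networks-status: |-
      [{
        "name": "ovn-kubernetes",
        "interface": "eth0",
        "ips": ["10.128.2.14"],
        "default": true,
        "dns": {}
      },{
        "name": "net1",
        "interface": "net1",
        "ips": ["192.168.10.15"],
        "mac": "56:9c:40:1a:7f:3e",
        "dns": {}
      }]
spec:
  containers:
  - name: example-container
    image: <image>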
- 1
- The k8s.v1.cni.cncf.io/networks-status parameter is a JSON array of objects. Each object describes the status of an additional network attached to the pod. The annotation value is stored as a plain text value.
22.7.3. Creating a non-uniform memory access (NUMA) aligned SR-IOV pod
You can create a NUMA aligned SR-IOV pod by restricting SR-IOV and the CPU resources allocated from the same NUMA node with restricted or single-numa-node Topology Manager policies.
Prerequisites
- You have installed the OpenShift CLI (oc).
- You have configured the CPU Manager policy to static. For more information on CPU Manager, see the "Additional resources" section.
- You have configured the Topology Manager policy to single-numa-node.

Note: When single-numa-node is unable to satisfy the request, you can configure the Topology Manager policy to restricted.
Procedure
Create the following SR-IOV pod spec, and then save the YAML in the <name>-sriov-pod.yaml file. Replace <name> with a name for this pod.

The following example shows an SR-IOV pod spec:
- 1
- Replace <name> with the name of the SR-IOV network attachment definition CR.
- 2
- Replace <image> with the name of the sample-pod image.
- 3
- To create the SR-IOV pod with guaranteed QoS, set memory limits equal to memory requests.
- 4
- To create the SR-IOV pod with guaranteed QoS, set cpu limits equal to cpu requests.
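The example spec itself was stripped from this page. A minimal sketch matching the callouts above (the resource quantities are illustrative) might look like the following:

apiVersion: v1
kind: Pod
metadata:
  name: sample-pod
  annotations:
    k8s.v1.cni.cncf.io/networks: <name>
spec:
  containers:
  - name: sample-container
    image: <image>
    command: ["sleep", "infinity"]
    resources:
      limits:
        memory: "1Gi"
        cpu: "2"
      requests:
        memory: "1Gi"
        cpu: "2"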
Create the sample SR-IOV pod by running the following command:
$ oc create -f <filename> 1

- 1
- Replace <filename> with the name of the file you created in the previous step.

Confirm that the sample-pod is configured with guaranteed QoS.

$ oc describe pod sample-pod

Confirm that the sample-pod is allocated with exclusive CPUs.

$ oc exec sample-pod -- cat /sys/fs/cgroup/cpuset/cpuset.cpus

Confirm that the SR-IOV device and CPUs that are allocated for the sample-pod are on the same NUMA node.

$ oc exec sample-pod -- cat /sys/fs/cgroup/cpuset/cpuset.cpus
22.7.4. A test pod template for clusters that use SR-IOV on OpenStack
The following testpmd pod demonstrates container creation with huge pages, reserved CPUs, and the SR-IOV port.
An example testpmd pod
- 1
- This example assumes that the name of the performance profile is
cnf-performance profile.
22.8. Configuring interface-level network sysctl settings for SR-IOV networks
As a cluster administrator, you can modify interface-level network sysctls using the tuning Container Network Interface (CNI) meta plugin for a pod connected to a SR-IOV network device.
22.8.1. Labeling nodes with an SR-IOV enabled NIC
If you want to enable SR-IOV on only SR-IOV capable nodes, there are a couple of ways to do this:
- Install the Node Feature Discovery (NFD) Operator. NFD detects the presence of SR-IOV enabled NICs and labels the nodes with node.alpha.kubernetes-incubator.io/nfd-network-sriov.capable = true.
- Examine the SriovNetworkNodeState CR for each node. The interfaces stanza includes a list of all of the SR-IOV devices discovered by the SR-IOV Network Operator on the worker node. Label each node with feature.node.kubernetes.io/network-sriov.capable: "true" by using the following command:

$ oc label node <node_name> feature.node.kubernetes.io/network-sriov.capable="true"

Note: You can label the nodes with whatever name you want.
22.8.2. Setting one sysctl flag
You can set interface-level network sysctl settings for a pod connected to a SR-IOV network device.
In this example, net.ipv4.conf.IFNAME.accept_redirects is set to 1 on the created virtual interfaces.
The sysctl-tuning-test namespace is used in this example.
Use the following command to create the sysctl-tuning-test namespace:

$ oc create namespace sysctl-tuning-test
22.8.2.1. Setting one sysctl flag on nodes with SR-IOV network devices
The SR-IOV Network Operator adds the SriovNetworkNodePolicy.sriovnetwork.openshift.io custom resource definition (CRD) to OpenShift Container Platform. You can configure an SR-IOV network device by creating a SriovNetworkNodePolicy custom resource (CR).
When applying the configuration specified in a SriovNetworkNodePolicy object, the SR-IOV Operator might drain and reboot the nodes.
It can take several minutes for a configuration change to apply.
Follow this procedure to create a SriovNetworkNodePolicy custom resource (CR).
Procedure
Create an SriovNetworkNodePolicy custom resource (CR). For example, save the following YAML as the file policyoneflag-sriov-node-network.yaml:
- 1
- The name for the custom resource object.
- 2
- The namespace where the SR-IOV Network Operator is installed.
- 3
- The resource name of the SR-IOV network device plugin. You can create multiple SR-IOV network node policies for a resource name.
- 4
- The node selector specifies the nodes to configure. Only SR-IOV network devices on the selected nodes are configured. The SR-IOV Container Network Interface (CNI) plugin and device plugin are deployed on selected nodes only.
- 5
- Optional: The priority is an integer value between
0 and 99. A smaller value receives higher priority. For example, a priority of 10 is a higher priority than 99. The default value is 99.
- 6
- The number of the virtual functions (VFs) to create for the SR-IOV physical network device. For an Intel network interface controller (NIC), the number of VFs cannot be larger than the total VFs supported by the device. For a Mellanox NIC, the number of VFs cannot be larger than
127. - 7
- The NIC selector identifies the device for the Operator to configure. You do not have to specify values for all the parameters. It is recommended to identify the network device with enough precision to avoid selecting a device unintentionally. If you specify
rootDevices, you must also specify a value for vendor, deviceID, or pfNames. If you specify both pfNames and rootDevices at the same time, ensure that they refer to the same device. If you specify a value for netFilter, then you do not need to specify any other parameter because a network ID is unique.
- 8
- Optional: An array of one or more physical function (PF) names for the device.
- 9
- Optional: The driver type for the virtual functions. The only allowed value is
netdevice. For a Mellanox NIC to work in DPDK mode on bare metal nodes, set isRdma to true.
- 10
- Optional: Configures whether to enable remote direct memory access (RDMA) mode. The default value is
false. If the isRdma parameter is set to true, you can continue to use the RDMA-enabled VF as a normal network device. A device can be used in either mode. Set isRdma to true and additionally set needVhostNet to true to configure a Mellanox NIC for use with Fast Datapath DPDK applications.
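The policy YAML itself was stripped from this page. A hedged sketch consistent with the callouts above (the PF name, VF count, and priority are illustrative) might look like the following:

apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: policyoneflag
  namespace: openshift-sriov-network-operator
spec:
  resourceName: policyoneflag
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  priority: 10
  numVfs: 5
  nicSelector:
    pfNames: ["ens5f1"]
  deviceType: netdevice
  isRdma: false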
Note: The vfio-pci driver type is not supported.

Create the SriovNetworkNodePolicy object:

$ oc create -f policyoneflag-sriov-node-network.yaml

After applying the configuration update, all the pods in the sriov-network-operator namespace change to the Running status.

To verify that the SR-IOV network device is configured, enter the following command. Replace <node_name> with the name of a node with the SR-IOV network device that you just configured.

$ oc get sriovnetworknodestates -n openshift-sriov-network-operator <node_name> -o jsonpath='{.status.syncStatus}'

Example output

Succeeded
22.8.2.2. Configuring sysctl on a SR-IOV network
You can set interface specific sysctl settings on virtual interfaces created by SR-IOV by adding the tuning configuration to the optional metaPlugins parameter of the SriovNetwork resource.
The SR-IOV Network Operator manages additional network definitions. When you specify an additional SR-IOV network to create, the SR-IOV Network Operator creates the NetworkAttachmentDefinition custom resource (CR) automatically.
Do not edit NetworkAttachmentDefinition custom resources that the SR-IOV Network Operator manages. Doing so might disrupt network traffic on your additional network.
To change the interface-level network net.ipv4.conf.IFNAME.accept_redirects sysctl settings, create an additional SR-IOV network with the Container Network Interface (CNI) tuning plugin.
Prerequisites
- Install the OpenShift Container Platform CLI (oc).
- Log in to the OpenShift Container Platform cluster as a user with cluster-admin privileges.
Procedure
Create the SriovNetwork custom resource (CR) for the additional SR-IOV network attachment and insert the metaPlugins configuration, as in the following example CR. Save the YAML as the file sriov-network-interface-sysctl.yaml.
- 1
- A name for the object. The SR-IOV Network Operator creates a NetworkAttachmentDefinition object with the same name.
- 2
- The namespace where the SR-IOV Network Operator is installed.
- 3
- The value for the
spec.resourceName parameter from the SriovNetworkNodePolicy object that defines the SR-IOV hardware for this additional network.
- 4
- The target namespace for the
SriovNetwork object. Only pods in the target namespace can attach to the additional network.
- 5
- A configuration object for the IPAM CNI plugin as a YAML block scalar. The plugin manages IP address assignment for the attachment definition.
- 6
- Optional: Set capabilities for the additional network. You can specify
"{ "ips": true }"to enable IP address support or"{ "mac": true }"to enable MAC address support. - 7
- Optional: The metaPlugins parameter is used to add additional capabilities to the device. In this use case set the
type field to tuning. Specify the interface-level network sysctl you want to set in the sysctl field.
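The example CR itself was stripped from this page. A hedged sketch that matches the callouts above might look like the following; the exact shape of the metaPlugins JSON, the IPAM configuration, and the object names are assumptions:

apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: onevalidflag
  namespace: openshift-sriov-network-operator
spec:
  resourceName: policyoneflag
  networkNamespace: sysctl-tuning-test
  ipam: '{ "type": "static" }'
  capabilities: '{ "mac": true, "ips": true }'
  metaPlugins: |
    {
      "type": "tuning",
      "sysctl": {
        "net.ipv4.conf.IFNAME.accept_redirects": "1"
      }
    }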
Create the SriovNetwork resource:

$ oc create -f sriov-network-interface-sysctl.yaml
Verifying that the NetworkAttachmentDefinition CR is successfully created
Confirm that the SR-IOV Network Operator created the
NetworkAttachmentDefinition CR by running the following command:

$ oc get network-attachment-definitions -n <namespace> 1

- 1
- Replace <namespace> with the value for networkNamespace that you specified in the SriovNetwork object. For example, sysctl-tuning-test.

Example output

NAME            AGE
onevalidflag    14m

Note: There might be a delay before the SR-IOV Network Operator creates the CR.
Verifying that the additional SR-IOV network attachment is successful
To verify that the tuning CNI is correctly configured and the additional SR-IOV network attachment is attached, do the following:
Create a Pod CR. Save the following YAML as the file examplepod.yaml:
- 1
- The name of the SR-IOV network attachment definition CR.
- 2
- Optional: The MAC address for the SR-IOV device that is allocated from the resource type defined in the SR-IOV network attachment definition CR. To use this feature, you also must specify
{ "mac": true }in the SriovNetwork object. - 3
- Optional: IP addresses for the SR-IOV device that are allocated from the resource type defined in the SR-IOV network attachment definition CR. Both IPv4 and IPv6 addresses are supported. To use this feature, you also must specify
{ "ips": true }in theSriovNetworkobject.
Create the Pod CR:

$ oc apply -f examplepod.yaml

Verify that the pod is created by running the following command:

$ oc get pod -n sysctl-tuning-test

Example output

NAME      READY   STATUS    RESTARTS   AGE
tunepod   1/1     Running   0          47s

Log in to the pod by running the following command:

$ oc rsh -n sysctl-tuning-test tunepod

Verify the values of the configured sysctl flag. Find the value net.ipv4.conf.IFNAME.accept_redirects by running the following command:

$ sysctl net.ipv4.conf.net1.accept_redirects

Example output

net.ipv4.conf.net1.accept_redirects = 1
22.8.3. Configuring sysctl settings for pods associated with bonded SR-IOV interface flag
You can set interface-level network sysctl settings for a pod connected to a bonded SR-IOV network device.
In this example, the specific network interface-level sysctl settings that can be configured are set on the bonded interface.
The sysctl-tuning-test is a namespace used in this example.
Use the following command to create the sysctl-tuning-test namespace:

$ oc create namespace sysctl-tuning-test
22.8.3.1. Setting all sysctl flags on nodes with bonded SR-IOV network devices
The SR-IOV Network Operator adds the SriovNetworkNodePolicy.sriovnetwork.openshift.io custom resource definition (CRD) to OpenShift Container Platform. You can configure an SR-IOV network device by creating a SriovNetworkNodePolicy custom resource (CR).
When applying the configuration specified in a SriovNetworkNodePolicy object, the SR-IOV Operator might drain the nodes, and in some cases, reboot nodes.
It might take several minutes for a configuration change to apply.
Follow this procedure to create a SriovNetworkNodePolicy custom resource (CR).
Procedure
Create an SriovNetworkNodePolicy custom resource (CR). Save the following YAML as the file policyallflags-sriov-node-network.yaml. Replace policyallflags with the name for the configuration.
- 1
- The name for the custom resource object.
- 2
- The namespace where the SR-IOV Network Operator is installed.
- 3
- The resource name of the SR-IOV network device plugin. You can create multiple SR-IOV network node policies for a resource name.
- 4
- The node selector specifies the nodes to configure. Only SR-IOV network devices on the selected nodes are configured. The SR-IOV Container Network Interface (CNI) plugin and device plugin are deployed on selected nodes only.
- 5
- Optional: The priority is an integer value between
0 and 99. A smaller value receives higher priority. For example, a priority of 10 is a higher priority than 99. The default value is 99.
- 6
- The number of virtual functions (VFs) to create for the SR-IOV physical network device. For an Intel network interface controller (NIC), the number of VFs cannot be larger than the total VFs supported by the device. For a Mellanox NIC, the number of VFs cannot be larger than
127. - 7
- The NIC selector identifies the device for the Operator to configure. You do not have to specify values for all the parameters. It is recommended to identify the network device with enough precision to avoid selecting a device unintentionally. If you specify
rootDevices, you must also specify a value for vendor, deviceID, or pfNames. If you specify both pfNames and rootDevices at the same time, ensure that they refer to the same device. If you specify a value for netFilter, then you do not need to specify any other parameter because a network ID is unique.
- 8
- Optional: An array of one or more physical function (PF) names for the device.
- 9
- Optional: The driver type for the virtual functions. The only allowed value is
netdevice. For a Mellanox NIC to work in DPDK mode on bare metal nodes, set isRdma to true.
- 10
- Optional: Configures whether to enable remote direct memory access (RDMA) mode. The default value is
false. If the isRdma parameter is set to true, you can continue to use the RDMA-enabled VF as a normal network device. A device can be used in either mode. Set isRdma to true and additionally set needVhostNet to true to configure a Mellanox NIC for use with Fast Datapath DPDK applications.
Note: The vfio-pci driver type is not supported.

Create the SriovNetworkNodePolicy object:

$ oc create -f policyallflags-sriov-node-network.yaml

After applying the configuration update, all the pods in the sriov-network-operator namespace change to the Running status.

To verify that the SR-IOV network device is configured, enter the following command. Replace <node_name> with the name of a node with the SR-IOV network device that you just configured.

$ oc get sriovnetworknodestates -n openshift-sriov-network-operator <node_name> -o jsonpath='{.status.syncStatus}'

Example output

Succeeded
22.8.3.2. Configuring sysctl on a bonded SR-IOV network
You can set interface specific sysctl settings on a bonded interface created from two SR-IOV interfaces. Do this by adding the tuning configuration to the optional Plugins parameter of the bond network attachment definition.
Do not edit NetworkAttachmentDefinition custom resources that the SR-IOV Network Operator manages. Doing so might disrupt network traffic on your additional network.
To change specific interface-level network sysctl settings, create the SriovNetwork custom resource (CR) with the Container Network Interface (CNI) tuning plugin by using the following procedure.
Prerequisites
- Install the OpenShift Container Platform CLI (oc).
- Log in to the OpenShift Container Platform cluster as a user with cluster-admin privileges.
Procedure
Create the SriovNetwork custom resource (CR) for the bonded interface as in the following example CR. Save the YAML as the file sriov-network-attachment.yaml.
- 1
- A name for the object. The SR-IOV Network Operator creates a NetworkAttachmentDefinition object with the same name.
- 2
- The namespace where the SR-IOV Network Operator is installed.
- 3
- The value for the
spec.resourceName parameter from the SriovNetworkNodePolicy object that defines the SR-IOV hardware for this additional network.
- 4
- The target namespace for the
SriovNetwork object. Only pods in the target namespace can attach to the additional network.
- 5
- Optional: The capabilities to configure for this additional network. You can specify
"{ "ips": true }"to enable IP address support or"{ "mac": true }"to enable MAC address support.
Create the SriovNetwork resource:

$ oc create -f sriov-network-attachment.yaml

Create a bond network attachment definition as in the following example CR. Save the YAML as the file sriov-bond-network-interface.yaml.
- 1
- The type is
bond. - 2
- The
modeattribute specifies the bonding mode. The bonding modes supported are:-
balance-rr- 0 -
active-backup- 1 balance-xor- 2For
balance-rrorbalance-xormodes, you must set thetrustmode toonfor the SR-IOV virtual function.
-
- 3
- The
failoverattribute is mandatory for active-backup mode. - 4
- The
linksInContainer=trueflag informs the Bond CNI that the required interfaces are to be found inside the container. By default, Bond CNI looks for these interfaces on the host which does not work for integration with SRIOV and Multus. - 5
- The
linkssection defines which interfaces will be used to create the bond. By default, Multus names the attached interfaces as: "net", plus a consecutive number, starting with one. - 6
- A configuration object for the IPAM CNI plugin as a YAML block scalar. The plugin manages IP address assignment for the attachment definition. In this pod example IP addresses are configured manually, so in this case,
ipamis set to static. - 7
- Add additional capabilities to the device. For example, set the
typefield totuning. Specify the interface-level networksysctlyou want to set in the sysctl field. This example sets all interface-level networksysctlsettings that can be set.
Create the bond network attachment resource:
oc create -f sriov-bond-network-interface.yaml
$ oc create -f sriov-bond-network-interface.yamlCopy to Clipboard Copied! Toggle word wrap Toggle overflow
Verifying that the NetworkAttachmentDefinition CR is successfully created
Confirm that the SR-IOV Network Operator created the
NetworkAttachmentDefinition CR by running the following command:

$ oc get network-attachment-definitions -n <namespace> 1

- 1
- Replace <namespace> with the networkNamespace that you specified when configuring the network attachment, for example, sysctl-tuning-test.

Example output

NAME                   AGE
bond-sysctl-network    22m
allvalidflags          47m

Note: There might be a delay before the SR-IOV Network Operator creates the CR.
Verifying that the additional SR-IOV network resource is successful
To verify that the tuning CNI is correctly configured and the additional SR-IOV network attachment is attached, do the following:
Create a Pod CR. For example, save the following YAML as the file examplepod.yaml:
- 1
- The name of the SR-IOV network attachment definition CR.
- 2
- Optional: The MAC address for the SR-IOV device that is allocated from the resource type defined in the SR-IOV network attachment definition CR. To use this feature, you also must specify
{ "mac": true }in the SriovNetwork object. - 3
- Optional: IP addresses for the SR-IOV device that are allocated from the resource type defined in the SR-IOV network attachment definition CR. Both IPv4 and IPv6 addresses are supported. To use this feature, you also must specify
{ "ips": true }in theSriovNetworkobject.
Apply the YAML:
$ oc apply -f examplepod.yaml

Verify that the pod is created by running the following command:

$ oc get pod -n sysctl-tuning-test

Example output

NAME      READY   STATUS    RESTARTS   AGE
tunepod   1/1     Running   0          47s

Log in to the pod by running the following command:

$ oc rsh -n sysctl-tuning-test tunepod

Verify the values of the configured sysctl flag. Find the value net.ipv6.neigh.IFNAME.base_reachable_time_ms by running the following command:

$ sysctl net.ipv6.neigh.bond0.base_reachable_time_ms

Example output

net.ipv6.neigh.bond0.base_reachable_time_ms = 20000
22.9. Using high performance multicast
You can use multicast on your Single Root I/O Virtualization (SR-IOV) hardware network.
22.9.1. High performance multicast
The OpenShift SDN network plugin supports multicast between pods on the default network. This is best used for low-bandwidth coordination or service discovery, and not high-bandwidth applications. For applications such as streaming media, like Internet Protocol television (IPTV) and multipoint videoconferencing, you can utilize Single Root I/O Virtualization (SR-IOV) hardware to provide near-native performance.
When using additional SR-IOV interfaces for multicast:
- Multicast packets must be sent or received by a pod through the additional SR-IOV interface.
- The physical network which connects the SR-IOV interfaces decides the multicast routing and topology, which is not controlled by OpenShift Container Platform.
22.9.2. Configuring an SR-IOV interface for multicast
The following procedure creates an example SR-IOV interface for multicast.
Prerequisites
- Install the OpenShift CLI (oc).
- You must log in to the cluster with a user that has the cluster-admin role.
Procedure
Create a SriovNetworkNodePolicy object.

Create a SriovNetwork object.

Create a pod with the multicast application:
- 1
- The NET_ADMIN capability is required only if your application needs to assign the multicast IP address to the SR-IOV interface. Otherwise, it can be omitted.
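The pod example itself was stripped from this page. A hedged sketch of such a multicast test pod, with an illustrative network name and image, might look like the following:

apiVersion: v1
kind: Pod
metadata:
  name: testpmd
  namespace: default
  annotations:
    k8s.v1.cni.cncf.io/networks: example-multicast-network
spec:
  containers:
  - name: example
    image: <image>
    securityContext:
      capabilities:
        add: ["NET_ADMIN"]   # only needed if the app assigns the multicast IP itself
    command: ["sleep", "infinity"]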
22.10. Using DPDK and RDMA
The containerized Data Plane Development Kit (DPDK) application is supported on OpenShift Container Platform. You can use Single Root I/O Virtualization (SR-IOV) network hardware with the Data Plane Development Kit (DPDK) and with remote direct memory access (RDMA).
For information on supported devices, refer to Supported devices.
22.10.1. Using a virtual function in DPDK mode with an Intel NIC
Prerequisites
- Install the OpenShift CLI (oc).
- Install the SR-IOV Network Operator.
- Log in as a user with cluster-admin privileges.
Procedure
Create the following SriovNetworkNodePolicy object, and then save the YAML in the intel-dpdk-node-policy.yaml file.
- 1
- Specify the driver type for the virtual functions to vfio-pci.
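The policy YAML itself was stripped from this page. A hedged sketch consistent with the callout above (the vendor, device ID, PF name, and VF count are illustrative) might look like the following:

apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: intel-dpdk-node-policy
  namespace: openshift-sriov-network-operator
spec:
  resourceName: intelnics
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  priority: 99
  numVfs: 10
  nicSelector:
    vendor: "8086"
    deviceID: "158b"
    pfNames: ["ens5f0"]
  deviceType: vfio-pci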
Note: See the Configuring SR-IOV network devices section for a detailed explanation on each option in SriovNetworkNodePolicy.

When applying the configuration specified in a SriovNetworkNodePolicy object, the SR-IOV Operator may drain the nodes, and in some cases, reboot nodes. It may take several minutes for a configuration change to apply. Ensure that there are enough available nodes in your cluster to handle the evicted workload beforehand.

After the configuration update is applied, all the pods in the openshift-sriov-network-operator namespace will change to a Running status.

Create the SriovNetworkNodePolicy object by running the following command:

$ oc create -f intel-dpdk-node-policy.yaml

Create the following
SriovNetwork object, and then save the YAML in the intel-dpdk-network.yaml file.
- 1
- Specify a configuration object for the ipam CNI plugin as a YAML block scalar. The plugin manages IP address assignment for the attachment definition.
NoteSee the "Configuring SR-IOV additional network" section for a detailed explanation on each option in
SriovNetwork.An optional library, app-netutil, provides several API methods for gathering network information about a container’s parent pod.
Create the
SriovNetworkobject by running the following command:oc create -f intel-dpdk-network.yaml
$ oc create -f intel-dpdk-network.yamlCopy to Clipboard Copied! Toggle word wrap Toggle overflow Create the following
Pod spec, and then save the YAML in the intel-dpdk-pod.yaml file.
- 1
- Specify the same
target_namespace where the SriovNetwork object intel-dpdk-network is created. If you would like to create the pod in a different namespace, change target_namespace in both the Pod spec and the SriovNetwork object.
- 2
- Specify the DPDK image which includes your application and the DPDK library used by application.
- 3
- Specify additional capabilities required by the application inside the container for hugepage allocation, system resource allocation, and network interface access.
- 4
- Mount a hugepage volume to the DPDK pod under
/mnt/huge. The hugepage volume is backed by the emptyDir volume type with the medium being Hugepages.
- 5
- Optional: Specify the number of DPDK devices allocated to DPDK pod. This resource request and limit, if not explicitly specified, will be automatically added by the SR-IOV network resource injector. The SR-IOV network resource injector is an admission controller component managed by the SR-IOV Operator. It is enabled by default and can be disabled by setting
enableInjector option to false in the default SriovOperatorConfig CR.
- 6
- Specify the number of CPUs. The DPDK pod usually requires exclusive CPUs to be allocated from the kubelet. This is achieved by setting CPU Manager policy to
static and creating a pod with Guaranteed QoS.
- 7
- Specify hugepage size
hugepages-1Gi or hugepages-2Mi and the quantity of hugepages that will be allocated to the DPDK pod. Configure 2Mi and 1Gi hugepages separately. Configuring 1Gi hugepages requires adding kernel arguments to Nodes. For example, adding the kernel arguments default_hugepagesz=1GB, hugepagesz=1G and hugepages=16 will result in 16*1Gi hugepages being allocated during system boot.
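The pod spec itself was stripped from this page. A hedged sketch matching the callouts above might look like the following; the resource name openshift.io/intelnics assumes the illustrative node policy sketched earlier, and the image, capabilities list, and quantities are assumptions:

apiVersion: v1
kind: Pod
metadata:
  name: dpdk-app
  namespace: <target_namespace>
  annotations:
    k8s.v1.cni.cncf.io/networks: intel-dpdk-network
spec:
  containers:
  - name: testpmd
    image: <DPDK_image>
    securityContext:
      runAsUser: 0
      capabilities:
        add: ["IPC_LOCK", "SYS_RESOURCE", "NET_RAW"]
    volumeMounts:
    - mountPath: /mnt/huge          # hugepage volume mount
      name: hugepage
    resources:
      limits:
        openshift.io/intelnics: "1"
        memory: "1Gi"
        cpu: "4"
        hugepages-1Gi: "4Gi"
      requests:
        openshift.io/intelnics: "1"
        memory: "1Gi"
        cpu: "4"
        hugepages-1Gi: "4Gi"
    command: ["sleep", "infinity"]
  volumes:
  - name: hugepage
    emptyDir:
      medium: HugePages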
Create the DPDK pod by running the following command:
$ oc create -f intel-dpdk-pod.yaml
22.10.2. Using a virtual function in DPDK mode with a Mellanox NIC
You can create a network node policy and create a Data Plane Development Kit (DPDK) pod using a virtual function in DPDK mode with a Mellanox NIC.
Prerequisites
- You have installed the OpenShift CLI (oc).
- You have installed the Single Root I/O Virtualization (SR-IOV) Network Operator.
- You have logged in as a user with cluster-admin privileges.
Procedure
Save the following
SriovNetworkNodePolicy YAML configuration to an mlx-dpdk-node-policy.yaml file:
- 1
- Specify the device hex code of the SR-IOV network device.
- 2
- Specify the driver type for the virtual functions to
netdevice. A Mellanox SR-IOV Virtual Function (VF) can work in DPDK mode without using the vfio-pci device type. The VF device appears as a kernel network interface inside a container.
- 3
- Enable Remote Direct Memory Access (RDMA) mode. This is required for Mellanox cards to work in DPDK mode.
Note: See Configuring an SR-IOV network device for a detailed explanation of each option in the SriovNetworkNodePolicy object.

When applying the configuration specified in an SriovNetworkNodePolicy object, the SR-IOV Operator might drain the nodes, and in some cases, reboot nodes. It might take several minutes for a configuration change to apply. Ensure that there are enough available nodes in your cluster to handle the evicted workload beforehand.

After the configuration update is applied, all the pods in the openshift-sriov-network-operator namespace will change to a Running status.

Create the SriovNetworkNodePolicy object by running the following command:

$ oc create -f mlx-dpdk-node-policy.yaml

Save the following
SriovNetwork YAML configuration to an mlx-dpdk-network.yaml file:
- 1
- Specify a configuration object for the IP Address Management (IPAM) Container Network Interface (CNI) plugin as a YAML block scalar. The plugin manages IP address assignment for the attachment definition.
Note: See Configuring an SR-IOV network device for a detailed explanation on each option in the SriovNetwork object.

The app-netutil option library provides several API methods for gathering network information about the parent pod of a container.

Create the SriovNetwork object by running the following command:

$ oc create -f mlx-dpdk-network.yaml

Save the following
Pod YAML configuration to an mlx-dpdk-pod.yaml file:
- 1
- Specify the same
target_namespace where the SriovNetwork object mlx-dpdk-network is created. To create the pod in a different namespace, change target_namespace in both the Pod spec and the SriovNetwork object.
- 2
- Specify the DPDK image which includes your application and the DPDK library used by the application.
- 3
- Specify additional capabilities required by the application inside the container for hugepage allocation, system resource allocation, and network interface access.
- 4
- Mount the hugepage volume to the DPDK pod under
/mnt/huge. The hugepage volume is backed by the emptyDir volume type with the medium being Hugepages.
- 5
- Optional: Specify the number of DPDK devices allocated for the DPDK pod. If not explicitly specified, this resource request and limit is automatically added by the SR-IOV network resource injector. The SR-IOV network resource injector is an admission controller component managed by SR-IOV Operator. It is enabled by default and can be disabled by setting the
enableInjector option to false in the default SriovOperatorConfig CR.
- 6
- Specify the number of CPUs. The DPDK pod usually requires that exclusive CPUs be allocated from the kubelet. To do this, set the CPU Manager policy to
static and create a pod with Guaranteed Quality of Service (QoS).
- 7
- Specify hugepage size
hugepages-1Gi or hugepages-2Mi and the quantity of hugepages that will be allocated to the DPDK pod. Configure 2Mi and 1Gi hugepages separately. Configuring 1Gi hugepages requires adding kernel arguments to Nodes.
Create the DPDK pod by running the following command:
$ oc create -f mlx-dpdk-pod.yaml
22.10.3. Overview of achieving a specific DPDK line rate
To achieve a specific Data Plane Development Kit (DPDK) line rate, deploy a Node Tuning Operator and configure Single Root I/O Virtualization (SR-IOV). You must also tune the DPDK settings for the following resources:
- Isolated CPUs
- Hugepages
- The topology scheduler
In previous versions of OpenShift Container Platform, the Performance Addon Operator was used to implement automatic tuning to achieve low latency performance for OpenShift Container Platform applications. In OpenShift Container Platform 4.11 and later, this functionality is part of the Node Tuning Operator.
DPDK test environment
The following diagram shows the components of a traffic-testing environment:
- Traffic generator: An application that can generate high-volume packet traffic.
- SR-IOV-supporting NIC: A network interface card compatible with SR-IOV. The card runs a number of virtual functions on a physical interface.
- Physical Function (PF): A PCI Express (PCIe) function of a network adapter that supports the SR-IOV interface.
- Virtual Function (VF): A lightweight PCIe function on a network adapter that supports SR-IOV. The VF is associated with the PCIe PF on the network adapter. The VF represents a virtualized instance of the network adapter.
- Switch: A network switch. Nodes can also be connected back-to-back.
- testpmd: An example application included with DPDK. The testpmd application can be used to test the DPDK in a packet-forwarding mode. The testpmd application is also an example of how to build a fully-fledged application using the DPDK Software Development Kit (SDK).
- worker 0 and worker 1: OpenShift Container Platform nodes.
22.10.4. Using SR-IOV and the Node Tuning Operator to achieve a DPDK line rate
You can use the Node Tuning Operator to configure isolated CPUs, hugepages, and a topology scheduler. You can then use the Node Tuning Operator with Single Root I/O Virtualization (SR-IOV) to achieve a specific Data Plane Development Kit (DPDK) line rate.
Prerequisites
- You have installed the OpenShift CLI (oc).
- You have installed the SR-IOV Network Operator.
- You have logged in as a user with cluster-admin privileges.
- You have deployed a standalone Node Tuning Operator.

Note: In previous versions of OpenShift Container Platform, the Performance Addon Operator was used to implement automatic tuning to achieve low latency performance for OpenShift applications. In OpenShift Container Platform 4.11 and later, this functionality is part of the Node Tuning Operator.
Procedure
Create a
PerformanceProfile object based on the following example:
- 1
- If hyperthreading is enabled on the system, allocate the relevant symbolic links to the
isolated and reserved CPU groups. If the system contains multiple non-uniform memory access nodes (NUMAs), allocate CPUs from both NUMAs to both groups. You can also use the Performance Profile Creator for this task. For more information, see Creating a performance profile.
- 2
- You can also specify a list of devices that will have their queues set to the reserved CPU count. For more information, see Reducing NIC queues using the Node Tuning Operator.
- 3
- Allocate the number and size of hugepages needed. You can specify the NUMA configuration for the hugepages. By default, the system allocates an even number to every NUMA node on the system. If needed, you can request the use of a realtime kernel for the nodes. See Provisioning a worker with real-time capabilities for more information.
- Save the yaml file as mlx-dpdk-perfprofile-policy.yaml.
- Apply the performance profile using the following command:

$ oc create -f mlx-dpdk-perfprofile-policy.yaml
22.10.4.1. Example SR-IOV Network Operator for virtual functions
You can use the Single Root I/O Virtualization (SR-IOV) Network Operator to allocate and configure Virtual Functions (VFs) from SR-IOV-supporting Physical Function NICs on the nodes.
For more information on deploying the Operator, see Installing the SR-IOV Network Operator. For more information on configuring an SR-IOV network device, see Configuring an SR-IOV network device.
There are some differences between running Data Plane Development Kit (DPDK) workloads on Intel VFs and Mellanox VFs. This section provides object configuration examples for both VF types. The following is an example of an sriovNetworkNodePolicy object used to run DPDK applications on Intel NICs:
The following is an example of an sriovNetworkNodePolicy object for Mellanox NICs:
22.10.4.2. Example SR-IOV network operator
The following is an example definition of an sriovNetwork object. In this case, Intel and Mellanox configurations are identical:
- 1
- You can use a different IP Address Management (IPAM) implementation, such as Whereabouts. For more information, see Dynamic IP address assignment configuration with Whereabouts.
- 2
- You must request the
networkNamespace where the network attachment definition will be created. You must create the sriovNetwork CR under the openshift-sriov-network-operator namespace.
- 3
- The
resourceName value must match that of the resourceName created under the sriovNetworkNodePolicy.
22.10.4.3. Example DPDK base workload
The following is an example of a Data Plane Development Kit (DPDK) container:
- 1
- Request the SR-IOV networks you need. Resources for the devices will be injected automatically.
- 2
- Disable the CPU and IRQ load balancing base. See Disabling interrupt processing for individual pods for more information.
- 3
- Set the
runtimeClass to performance-performance. Do not set the runtimeClass to HostNetwork or privileged.
- 4
- Request an equal number of resources for requests and limits to start the pod with
Guaranteed Quality of Service (QoS).
Do not start the pod with SLEEP and then exec into the pod to start the testpmd or the DPDK workload. This can add additional interrupts as the exec process is not pinned to any CPU.
22.10.4.4. Example testpmd script
The following is an example script for running testpmd:
This example uses two different sriovNetwork CRs. The environment variable contains the Virtual Function (VF) PCI address that was allocated for the pod. If you use the same network in the pod definition, you must split the pciAddress. It is important to configure the correct MAC addresses of the traffic generator. This example uses custom MAC addresses.
22.10.5. Using a virtual function in RDMA mode with a Mellanox NIC
RDMA over Converged Ethernet (RoCE) is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
RDMA over Converged Ethernet (RoCE) is the only supported mode when using RDMA on OpenShift Container Platform.
Prerequisites
- Install the OpenShift CLI (oc).
- Install the SR-IOV Network Operator.
- Log in as a user with cluster-admin privileges.
Procedure
Create the following
SriovNetworkNodePolicy object, and then save the YAML in the mlx-rdma-node-policy.yaml file.

Note: See the Configuring SR-IOV network devices section for a detailed explanation on each option in SriovNetworkNodePolicy.

When applying the configuration specified in a SriovNetworkNodePolicy object, the SR-IOV Operator may drain the nodes, and in some cases, reboot nodes. It may take several minutes for a configuration change to apply. Ensure that there are enough available nodes in your cluster to handle the evicted workload beforehand.

After the configuration update is applied, all the pods in the openshift-sriov-network-operator namespace will change to a Running status.

Create the SriovNetworkNodePolicy object by running the following command:

$ oc create -f mlx-rdma-node-policy.yaml

Create the following
SriovNetwork object, and then save the YAML in the mlx-rdma-network.yaml file.
- 1
- Specify a configuration object for the ipam CNI plugin as a YAML block scalar. The plugin manages IP address assignment for the attachment definition.
NoteSee the "Configuring SR-IOV additional network" section for a detailed explanation on each option in
SriovNetwork.An optional library, app-netutil, provides several API methods for gathering network information about a container’s parent pod.
Create the
SriovNetworkNodePolicyobject by running the following command:oc create -f mlx-rdma-network.yaml
$ oc create -f mlx-rdma-network.yamlCopy to Clipboard Copied! Toggle word wrap Toggle overflow Create the following
Pod spec, and then save the YAML in the mlx-rdma-pod.yaml file.
- 1
- Specify the same
target_namespace where the SriovNetwork object mlx-rdma-network is created. If you would like to create the pod in a different namespace, change target_namespace in both the Pod spec and the SriovNetwork object.
- 2
- Specify the RDMA image which includes your application and RDMA library used by application.
- 3
- Specify additional capabilities required by the application inside the container for hugepage allocation, system resource allocation, and network interface access.
- 4
- Mount the hugepage volume to RDMA pod under
/mnt/huge. The hugepage volume is backed by the emptyDir volume type with the medium being Hugepages.
- 5
- Specify number of CPUs. The RDMA pod usually requires exclusive CPUs be allocated from the kubelet. This is achieved by setting CPU Manager policy to
static and creating a pod with Guaranteed QoS.
- 6
- Specify hugepage size
hugepages-1Gi or hugepages-2Mi and the quantity of hugepages that will be allocated to the RDMA pod. Configure 2Mi and 1Gi hugepages separately. Configuring 1Gi hugepages requires adding kernel arguments to Nodes.
Create the RDMA pod by running the following command:
$ oc create -f mlx-rdma-pod.yaml
22.10.6. A test pod template for clusters that use OVS-DPDK on OpenStack
The following testpmd pod demonstrates container creation with huge pages, reserved CPUs, and the SR-IOV port.
An example testpmd pod
22.10.7. A test pod template for clusters that use OVS hardware offloading on OpenStack
The following testpmd pod demonstrates Open vSwitch (OVS) hardware offloading on Red Hat OpenStack Platform (RHOSP).
An example testpmd pod
- 1
- If your performance profile is not named
cnf-performance profile, replace that string with the correct performance profile name.
22.11. Using pod-level bonding
Bonding at the pod level is vital for enabling workloads inside pods that require high availability and higher throughput. With pod-level bonding, you can create a bond interface from multiple single root I/O virtualization (SR-IOV) virtual function interfaces in a kernel mode interface. The SR-IOV virtual functions are passed into the pod and attached to a kernel driver.
One scenario where pod-level bonding is required is creating a bond interface from multiple SR-IOV virtual functions on different physical functions. Creating a bond interface from two different physical functions on the host can be used to achieve high availability and throughput at the pod level.
For guidance on tasks such as creating an SR-IOV network, network policies, network attachment definitions, and pods, see Configuring an SR-IOV network device.
22.11.1. Configuring a bond interface from two SR-IOV interfaces Copy linkLink copied to clipboard!
Bonding enables multiple network interfaces to be aggregated into a single logical "bonded" interface. Bond Container Network Interface (Bond-CNI) brings bond capability into containers.
A Bond-CNI interface can be created by using Single Root I/O Virtualization (SR-IOV) virtual functions and placing them in the container network namespace.
OpenShift Container Platform only supports Bond-CNI using SR-IOV virtual functions. The SR-IOV Network Operator provides the SR-IOV CNI plugin needed to manage the virtual functions. Other CNIs or types of interfaces are not supported.
Prerequisites
- The SR-IOV Network Operator must be installed and configured to obtain virtual functions in a container.
- To configure SR-IOV interfaces, an SR-IOV network and policy must be created for each interface.
- The SR-IOV Network Operator creates a network attachment definition for each SR-IOV interface, based on the SR-IOV network and policy defined.
- The linkState is set to the default value auto for the SR-IOV virtual function.
22.11.1.1. Creating a bond network attachment definition Copy linkLink copied to clipboard!
Now that the SR-IOV virtual functions are available, you can create a bond network attachment definition.
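The network attachment definition YAML is not reproduced above. A minimal sketch, matching the callouts that follow, might look like the following; the object name, namespace, IPAM settings, and the failover-related field name (shown here as failOverMac) are illustrative assumptions.
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: bond-net1          # assumed name
  namespace: demo          # assumed namespace
spec:
  config: '{
    "type": "bond",
    "cniVersion": "0.3.1",
    "name": "bond-net1",
    "mode": "active-backup",
    "failOverMac": 1,
    "linksInContainer": true,
    "miimon": "100",
    "links": [
      {"name": "net1"},
      {"name": "net2"}
    ],
    "ipam": {
      "type": "host-local",
      "subnet": "10.56.217.0/24"
    }
  }'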
- 1
- The cni-type is always set to bond.
- 2
- The mode attribute specifies the bonding mode.
Note: The supported bonding modes are:
- balance-rr - 0
- active-backup - 1
- balance-xor - 2
For balance-rr or balance-xor modes, you must set the trust mode to on for the SR-IOV virtual function.
- 3
- The failover attribute is mandatory for active-backup mode and must be set to 1.
- 4
- The linksInContainer=true flag informs the Bond CNI that the required interfaces are to be found inside the container. By default, the Bond CNI looks for these interfaces on the host, which does not work for integration with SR-IOV and Multus.
- 5
- The links section defines which interfaces are used to create the bond. By default, Multus names the attached interfaces "net" plus a consecutive number, starting with one.
22.11.1.2. Creating a pod using a bond interface Copy linkLink copied to clipboard!
Test the setup by creating a pod with a YAML file named, for example, podbonding.yaml, with content similar to the following:
- 1
- Note the network annotation: it contains two SR-IOV network attachments and one bond network attachment. The bond attachment uses the two SR-IOV interfaces as bonded port interfaces.
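The podbonding.yaml content is not reproduced above. A minimal sketch, assuming the SR-IOV networks sriovnet1 and sriovnet2 and the bond attachment bond-net1 in the demo namespace; the pod name and image are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: bondpod1
  namespace: demo
  annotations:
    k8s.v1.cni.cncf.io/networks: demo/sriovnet1, demo/sriovnet2, demo/bond-net1  # 1
spec:
  containers:
  - name: podexample
    image: <image>                         # any image that keeps the container running
    command: ["/bin/bash", "-c", "sleep INF"]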
Apply the YAML by running the following command:
$ oc apply -f podbonding.yaml
Inspect the pod interfaces with the following command:
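The inspection command itself is not shown above. Assuming the example pod name bondpod1 in the demo namespace from the sketch earlier, one way to inspect the interfaces is:
$ oc rsh -n demo bondpod1 ip a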
Note: If no interface names are configured in the pod annotation, interface names are assigned automatically as net<n>, with <n> starting at 1.
Optional: If you want to set a specific interface name, for example bond0, edit the k8s.v1.cni.cncf.io/networks annotation and set bond0 as the interface name as follows:
annotations:
  k8s.v1.cni.cncf.io/networks: demo/sriovnet1, demo/sriovnet2, demo/bond-net1@bond0
22.12. Configuring hardware offloading Copy linkLink copied to clipboard!
As a cluster administrator, you can configure hardware offloading on compatible nodes to increase data processing performance and reduce load on host CPUs.
22.12.1. About hardware offloading Copy linkLink copied to clipboard!
Open vSwitch hardware offloading is a method of processing network tasks by diverting them away from the CPU and offloading them to a dedicated processor on a network interface controller. As a result, clusters can benefit from faster data transfer speeds, reduced CPU workloads, and lower computing costs.
The key element for this feature is a modern class of network interface controllers known as SmartNICs. A SmartNIC is a network interface controller that is able to handle computationally-heavy network processing tasks. In the same way that a dedicated graphics card can improve graphics performance, a SmartNIC can improve network performance. In each case, a dedicated processor improves performance for a specific type of processing task.
In OpenShift Container Platform, you can configure hardware offloading for bare metal nodes that have a compatible SmartNIC. Hardware offloading is configured and enabled by the SR-IOV Network Operator.
Hardware offloading is not compatible with all workloads or application types. Only the following two communication types are supported:
- pod-to-pod
- pod-to-service, where the service is a ClusterIP service backed by a regular pod
In all cases, hardware offloading takes place only when those pods and services are assigned to nodes that have a compatible SmartNIC. Suppose, for example, that a pod on a node with hardware offloading tries to communicate with a service on a regular node. On the regular node, all the processing takes place in the kernel, so the overall performance of the pod-to-service communication is limited to the maximum performance of that regular node. Hardware offloading is not compatible with DPDK applications.
Enabling hardware offloading on a node, but not configuring pods to use it, can result in decreased throughput performance for pod traffic. You cannot configure hardware offloading for pods that are managed by OpenShift Container Platform.
22.12.2. Supported devices Copy linkLink copied to clipboard!
Hardware offloading is supported on the following network interface controllers:
| Manufacturer | Model | Vendor ID | Device ID |
|---|---|---|---|
| Mellanox | MT27800 Family [ConnectX‑5] | 15b3 | 1017 |
| Mellanox | MT28880 Family [ConnectX‑5 Ex] | 15b3 | 1019 |
| Manufacturer | Model | Vendor ID | Device ID |
|---|---|---|---|
| Mellanox | MT2892 Family [ConnectX-6 Dx] | 15b3 | 101d |
| Mellanox | MT2894 Family [ConnectX-6 Lx] | 15b3 | 101f |
| Mellanox | MT42822 BlueField-2 in ConnectX-6 NIC mode | 15b3 | a2d6 |
Using a ConnectX-6 Lx or BlueField-2 in ConnectX-6 NIC mode device is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
22.12.3. Prerequisites Copy linkLink copied to clipboard!
- Your cluster has at least one bare metal machine with a network interface controller that is supported for hardware offloading.
- You installed the SR-IOV Network Operator.
- Your cluster uses the OVN-Kubernetes network plugin.
- In your OVN-Kubernetes network plugin configuration, the gatewayConfig.routingViaHost field is set to false.
22.12.4. Setting the SR-IOV Network Operator into systemd mode Copy linkLink copied to clipboard!
To support hardware offloading, you must first set the SR-IOV Network Operator into systemd mode.
Prerequisites
- You installed the OpenShift CLI (oc).
- You have access to the cluster as a user that has the cluster-admin role.
Procedure
Create a SriovOperatorConfig custom resource (CR) to deploy all the SR-IOV Operator components:
Create a file named sriovOperatorConfig.yaml that contains the following YAML:
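The YAML content is not reproduced above. A minimal sketch of what sriovOperatorConfig.yaml might look like; the key setting for this procedure is configurationMode set to systemd, and the other field values are illustrative assumptions.
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovOperatorConfig
metadata:
  name: default
  namespace: openshift-sriov-network-operator
spec:
  enableInjector: true
  enableOperatorWebhook: true
  configurationMode: "systemd"   # sets the SR-IOV Network Operator into systemd mode
  logLevel: 2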
Create the resource by running the following command:
$ oc apply -f sriovOperatorConfig.yaml
22.12.5. Configuring a machine config pool for hardware offloading Copy linkLink copied to clipboard!
To enable hardware offloading, you now create a dedicated machine config pool and configure it to work with the SR-IOV Network Operator.
Prerequisites
- The SR-IOV Network Operator is installed and set into systemd mode.
Procedure
Create a machine config pool for machines you want to use hardware offloading on.
Create a file, such as mcp-offloading.yaml, with content like the following example:
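The example content is not reproduced above. A sketch of what mcp-offloading.yaml might look like, assuming the pool name mcp-offloading used elsewhere in this procedure:
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: mcp-offloading
spec:
  machineConfigSelector:
    matchExpressions:
      - {key: machineconfiguration.openshift.io/role, operator: In, values: [worker, mcp-offloading]}
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/mcp-offloading: ""   # matches the node label applied in the next step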
Apply the configuration for the machine config pool:
$ oc create -f mcp-offloading.yaml
Add nodes to the machine config pool. Label each node with the node role label of your pool:
$ oc label node worker-2 node-role.kubernetes.io/mcp-offloading=""
Optional: To verify that the new pool is created, run the following command:
$ oc get nodes
Example output
Add this machine config pool to the SriovNetworkPoolConfig custom resource:
Create a file, such as sriov-pool-config.yaml, with content like the following example:
- 1
- The name of your machine config pool for hardware offloading.
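The example content is not reproduced above. A minimal sketch of sriov-pool-config.yaml, assuming the machine config pool name mcp-offloading; the object name is an illustrative assumption.
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkPoolConfig
metadata:
  name: sriovnetworkpoolconfig-offload
  namespace: openshift-sriov-network-operator
spec:
  ovsHardwareOffloadConfig:
    name: mcp-offloading   # 1: the name of your machine config pool for hardware offloading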
Apply the configuration:
$ oc create -f <SriovNetworkPoolConfig_name>.yaml
Note: When you apply the configuration specified in a SriovNetworkPoolConfig object, the SR-IOV Operator drains and restarts the nodes in the machine config pool. It might take several minutes for a configuration change to apply.
22.12.6. Configuring the SR-IOV network node policy Copy linkLink copied to clipboard!
You can create an SR-IOV network device configuration for a node by creating an SR-IOV network node policy. To enable hardware offloading, you must define the .spec.eSwitchMode field with the value "switchdev".
The following procedure creates an SR-IOV interface for a network interface controller with hardware offloading.
Prerequisites
- You installed the OpenShift CLI (oc).
- You have access to the cluster as a user with the cluster-admin role.
Procedure
Create a file, such as sriov-node-policy.yaml, with content like the following example:
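The example content is not reproduced above. A sketch of sriov-node-policy.yaml with eSwitchMode set to switchdev, as required for hardware offloading; the policy name, interface name, resource name, VF count, and node selector label are illustrative assumptions.
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: sriov-node-policy                # <.> assumed name for the custom resource object
  namespace: openshift-sriov-network-operator
spec:
  deviceType: netdevice                  # <.> required; hardware offloading is not supported with vfio-pci
  eSwitchMode: "switchdev"               # <.> required for hardware offloading
  nicSelector:
    pfNames:
      - ens8f0                           # assumed physical function name
  nodeSelector:
    node-role.kubernetes.io/mcp-offloading: ""
  numVfs: 6
  priority: 5
  resourceName: mlxnics                  # assumed resource name
  isRdma: true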
<.> The name for the custom resource object.
<.> Required. Hardware offloading is not supported with vfio-pci.
<.> Required.
Apply the configuration for the policy:
$ oc create -f sriov-node-policy.yaml
Note: When you apply the configuration specified in a SriovNetworkNodePolicy object, the SR-IOV Operator drains and restarts the nodes in the machine config pool. It might take several minutes for a configuration change to apply.
22.12.6.1. An example SR-IOV network node policy for OpenStack Copy linkLink copied to clipboard!
The following example describes an SR-IOV interface for a network interface controller (NIC) with hardware offloading on Red Hat OpenStack Platform (RHOSP).
An SR-IOV interface for a NIC with hardware offloading on RHOSP
22.12.7. Creating a network attachment definition Copy linkLink copied to clipboard!
After you define the machine config pool and the SR-IOV network node policy, you can create a network attachment definition for the network interface card you specified.
Prerequisites
- You installed the OpenShift CLI (oc).
- You have access to the cluster as a user with the cluster-admin role.
Procedure
Create a file, such as net-attach-def.yaml, with content like the following example:
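The example content is not reproduced above. A sketch of net-attach-def.yaml; the resource name is assumed to match the node policy sketch, the name and namespace match the example output shown later in this section, and the CNI configuration string is an assumption based on the OVN-Kubernetes overlay type.
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: net-attach-def                                       # <.> the name for your network attachment definition
  namespace: net-attach-def                                  # <.> the namespace for your network attachment definition
  annotations:
    k8s.v1.cni.cncf.io/resourceName: openshift.io/mlxnics    # <.> value of spec.resourceName from the SriovNetworkNodePolicy (assumed)
spec:
  config: '{"cniVersion":"0.3.1","name":"ovn-kubernetes","type":"ovn-k8s-cni-overlay","ipam":{},"dns":{}}'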
<.> The name for your network attachment definition.
<.> The namespace for your network attachment definition.
<.> This is the value of the spec.resourceName field that you specified in the SriovNetworkNodePolicy object.
Apply the configuration for the network attachment definition:
$ oc create -f net-attach-def.yaml
Verification
Run the following command to see whether the new definition is present:
$ oc get net-attach-def -A
Example output
NAMESPACE          NAME             AGE
net-attach-def     net-attach-def   43h
22.12.8. Adding the network attachment definition to your pods Copy linkLink copied to clipboard!
After you create the machine config pool, the SriovNetworkPoolConfig and SriovNetworkNodePolicy custom resources, and the network attachment definition, you can apply these configurations to your pods by adding the network attachment definition to your pod specifications.
Procedure
In the pod specification, add the .metadata.annotations.k8s.v1.cni.cncf.io/networks field and specify the network attachment definition you created for hardware offloading:
....
metadata:
  annotations:
    v1.multus-cni.io/default-network: net-attach-def/net-attach-def <.>
<.> The value must be the name and namespace of the network attachment definition you created for hardware offloading.
22.13. Switching Bluefield-2 from DPU to NIC Copy linkLink copied to clipboard!
You can switch the Bluefield-2 network device from data processing unit (DPU) mode to network interface controller (NIC) mode.
Switching Bluefield-2 from data processing unit (DPU) mode to network interface controller (NIC) mode is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.
For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.
22.13.1. Switching Bluefield-2 from DPU mode to NIC mode Copy linkLink copied to clipboard!
Use the following procedure to switch Bluefield-2 from data processing unit (DPU) mode to network interface controller (NIC) mode.
Currently, only switching Bluefield-2 from DPU to NIC mode is supported. Switching from NIC mode to DPU mode is unsupported.
Prerequisites
- You have installed the SR-IOV Network Operator. For more information, see "Installing SR-IOV Network Operator".
- You have updated Bluefield-2 to the latest firmware. For more information, see Firmware for NVIDIA BlueField-2.
Procedure
Add the following labels to each of your worker nodes by entering the following commands:
$ oc label node <example_node_name_one> node-role.kubernetes.io/sriov=
$ oc label node <example_node_name_two> node-role.kubernetes.io/sriov=
Create a machine config pool for the SR-IOV Operator, for example:
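The machine config pool YAML is not reproduced above. A sketch, assuming the pool name sriov and the node-role.kubernetes.io/sriov label applied in the previous step:
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: sriov
spec:
  machineConfigSelector:
    matchExpressions:
      - {key: machineconfiguration.openshift.io/role, operator: In, values: [worker, sriov]}
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/sriov: ""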
Apply the following machineconfig.yaml file to the worker nodes:
- 1
- Optional: You can specify the PCI address of a specific card, for example ExecStart=/bin/bash -c '/etc/default/switch_in_sriov_config_daemon.sh nic 0000:5e:00.0 || echo done'. By default, the first device is selected. If there is more than one device, you must specify which PCI address to use. The PCI address must be the same on all nodes that are switching Bluefield-2 from DPU mode to NIC mode.
- Wait for the worker nodes to restart. After restarting, the Bluefield-2 network device on the worker nodes is switched into NIC mode.
- Optional: You might need to restart the host hardware because most recent Bluefield-2 firmware releases require a hardware restart to switch into NIC mode.
22.14. Uninstalling the SR-IOV Network Operator Copy linkLink copied to clipboard!
To uninstall the SR-IOV Network Operator, you must delete any running SR-IOV workloads, uninstall the Operator, and delete the webhooks that the Operator used.
22.14.1. Uninstalling the SR-IOV Network Operator Copy linkLink copied to clipboard!
As a cluster administrator, you can uninstall the SR-IOV Network Operator.
Prerequisites
- You have access to an OpenShift Container Platform cluster using an account with cluster-admin permissions.
- You have the SR-IOV Network Operator installed.
Procedure
Delete all SR-IOV custom resources (CRs):
$ oc delete sriovnetwork -n openshift-sriov-network-operator --all
$ oc delete sriovnetworknodepolicy -n openshift-sriov-network-operator --all
$ oc delete sriovibnetwork -n openshift-sriov-network-operator --all
- Follow the instructions in the "Deleting Operators from a cluster" section to remove the SR-IOV Network Operator from your cluster.
Delete the SR-IOV custom resource definitions that remain in the cluster after the SR-IOV Network Operator is uninstalled:
$ oc delete crd sriovibnetworks.sriovnetwork.openshift.io
$ oc delete crd sriovnetworknodepolicies.sriovnetwork.openshift.io
$ oc delete crd sriovnetworknodestates.sriovnetwork.openshift.io
$ oc delete crd sriovnetworkpoolconfigs.sriovnetwork.openshift.io
$ oc delete crd sriovnetworks.sriovnetwork.openshift.io
$ oc delete crd sriovoperatorconfigs.sriovnetwork.openshift.io
Delete the SR-IOV webhooks:
$ oc delete mutatingwebhookconfigurations network-resources-injector-config
$ oc delete MutatingWebhookConfiguration sriov-operator-webhook-config
$ oc delete ValidatingWebhookConfiguration sriov-operator-webhook-config
Delete the SR-IOV Network Operator namespace:
$ oc delete namespace openshift-sriov-network-operator
Chapter 23. OVN-Kubernetes network plugin Copy linkLink copied to clipboard!
23.1. About the OVN-Kubernetes network plugin Copy linkLink copied to clipboard!
The OpenShift Container Platform cluster uses a virtualized network for pod and service networks.
Part of Red Hat OpenShift Networking, the OVN-Kubernetes network plugin is the default network provider for OpenShift Container Platform. OVN-Kubernetes is based on Open Virtual Network (OVN) and provides an overlay-based networking implementation. A cluster that uses the OVN-Kubernetes plugin also runs Open vSwitch (OVS) on each node. OVN configures OVS on each node to implement the declared network configuration.
OVN-Kubernetes is the default networking solution for OpenShift Container Platform and single-node OpenShift deployments.
OVN-Kubernetes, which arose from the OVS project, uses many of the same constructs, such as OpenFlow rules, to determine how packets travel through the network. For more information, see the Open Virtual Network website.
OVN-Kubernetes is a series of daemons for OVS that translate virtual network configurations into OpenFlow rules. OpenFlow is a protocol for communicating with network switches and routers. It provides a means of remotely controlling the flow of network traffic on a network device, allowing network administrators to configure, manage, and monitor the flow of network traffic.
OVN-Kubernetes provides advanced functionality that is not available with OpenFlow alone. OVN supports distributed virtual routing, distributed logical switches, access control, DHCP, and DNS. OVN implements distributed virtual routing within logic flows, which equate to OpenFlow flows. For example, if a pod sends a DHCP request on the network, a logic flow rule matches that broadcast packet and responds with a gateway, a DNS server, an IP address, and so on.
OVN-Kubernetes runs a daemon on each node. There are daemon sets for the databases and for the OVN controller that run on every node. The OVN controller programs the Open vSwitch daemon on the nodes to support the network provider features: egress IPs, firewalls, routers, hybrid networking, IPsec encryption, IPv6, network policy, network policy logs, hardware offloading, and multicast.
23.1.1. OVN-Kubernetes purpose Copy linkLink copied to clipboard!
The OVN-Kubernetes network plugin is an open-source, fully-featured Kubernetes CNI plugin that uses Open Virtual Network (OVN) to manage network traffic flows. OVN is a community-developed, vendor-agnostic network virtualization solution. The OVN-Kubernetes network plugin:
- Uses OVN (Open Virtual Network) to manage network traffic flows.
- Implements Kubernetes network policy support, including ingress and egress rules.
- Uses the Geneve (Generic Network Virtualization Encapsulation) protocol rather than VXLAN to create an overlay network between nodes.
The OVN-Kubernetes network plugin provides the following advantages over OpenShift SDN.
- Full support for IPv6 single-stack and IPv4/IPv6 dual-stack networking on supported platforms
- Support for hybrid clusters with both Linux and Microsoft Windows workloads
- Optional IPsec encryption of intra-cluster communications
- Offload of network data processing from host CPU to compatible network cards and data processing units (DPUs)
23.1.2. Supported network plugin feature matrix Copy linkLink copied to clipboard!
Red Hat OpenShift Networking offers two options for the network plugin: OpenShift SDN and OVN-Kubernetes. The following table summarizes the current feature support for both network plugins:
| Feature | OpenShift SDN | OVN-Kubernetes |
|---|---|---|
| Egress IPs | Supported | Supported |
| Egress firewall | Supported | Supported [1] |
| Egress router | Supported | Supported [2] |
| Hybrid networking | Not supported | Supported |
| IPsec encryption for intra-cluster communication | Not supported | Supported |
| IPv4 single-stack | Supported | Supported |
| IPv6 single-stack | Not supported | Supported [3] |
| IPv4/IPv6 dual-stack | Not supported | Supported [4] |
| IPv6/IPv4 dual-stack | Not supported | Supported [5] |
| Kubernetes network policy | Supported | Supported |
| Kubernetes network policy logs | Not supported | Supported |
| Hardware offloading | Not supported | Supported |
| Multicast | Supported | Supported |
- Egress firewall is also known as egress network policy in OpenShift SDN. This is not the same as network policy egress.
- Egress router for OVN-Kubernetes supports only redirect mode.
- IPv6 single-stack networking on a bare-metal platform.
- IPv4/IPv6 dual-stack networking on bare-metal, IBM Power®, and IBM Z® platforms.
- IPv6/IPv4 dual-stack networking on bare-metal and IBM Power® platforms.
23.1.3. OVN-Kubernetes IPv6 and dual-stack limitations Copy linkLink copied to clipboard!
The OVN-Kubernetes network plugin has the following limitations:
For clusters configured for dual-stack networking, both IPv4 and IPv6 traffic must use the same network interface as the default gateway. If this requirement is not met, pods on the host in the ovnkube-node daemon set enter the CrashLoopBackOff state. If you display a pod with a command such as oc get pod -n openshift-ovn-kubernetes -l app=ovnkube-node -o yaml, the status field contains more than one message about the default gateway, as shown in the following output:
I1006 16:09:50.985852   60651 helper_linux.go:73] Found default gateway interface br-ex 192.168.127.1
I1006 16:09:50.985923   60651 helper_linux.go:73] Found default gateway interface ens4 fe80::5054:ff:febe:bcd4
F1006 16:09:50.985939   60651 ovnkube.go:130] multiple gateway interfaces detected: br-ex ens4
The only resolution is to reconfigure the host networking so that both IP families use the same network interface for the default gateway.
For clusters configured for dual-stack networking, both the IPv4 and IPv6 routing tables must contain the default gateway. If this requirement is not met, pods on the host in the ovnkube-node daemon set enter the CrashLoopBackOff state. If you display a pod with a command such as oc get pod -n openshift-ovn-kubernetes -l app=ovnkube-node -o yaml, the status field contains more than one message about the default gateway, as shown in the following output:
I0512 19:07:17.589083  108432 helper_linux.go:74] Found default gateway interface br-ex 192.168.123.1
F0512 19:07:17.589141  108432 ovnkube.go:133] failed to get default gateway interface
The only resolution is to reconfigure the host networking so that both IP families contain the default gateway.
23.1.4. Session affinity Copy linkLink copied to clipboard!
Session affinity is a feature that applies to Kubernetes Service objects. You can use session affinity if you want to ensure that each time you connect to a <service_VIP>:<Port>, the traffic is always load balanced to the same back end. For more information, including how to set session affinity based on a client’s IP address, see Session affinity.
Stickiness timeout for session affinity
The OVN-Kubernetes network plugin for OpenShift Container Platform calculates the stickiness timeout for a session from a client based on the last packet. For example, if you run a curl command 10 times, the sticky session timer starts from the tenth packet not the first. As a result, if the client is continuously contacting the service, then the session never times out. The timeout starts when the service has not received a packet for the amount of time set by the timeoutSeconds parameter.
23.2. OVN-Kubernetes architecture Copy linkLink copied to clipboard!
23.2.1. Introduction to OVN-Kubernetes architecture Copy linkLink copied to clipboard!
The following diagram shows the OVN-Kubernetes architecture.
Figure 23.1. OVN-Kubernetes architecture
The key components are:
- Cloud Management System (CMS) - A platform specific client for OVN that provides a CMS specific plugin for OVN integration. The plugin translates the cloud management system’s concept of the logical network configuration, stored in the CMS configuration database in a CMS-specific format, into an intermediate representation understood by OVN.
- OVN Northbound database (nbdb) - Stores the logical network configuration passed by the CMS plugin.
- OVN Southbound database (sbdb) - Stores the physical and logical network configuration state for the Open vSwitch (OVS) system on each node, including tables that bind them.
- ovn-northd - The intermediary client between nbdb and sbdb. It translates the logical network configuration, expressed in terms of conventional network concepts and taken from the nbdb, into logical data path flows in the sbdb below it. The container name is northd and it runs in the ovnkube-master pods.
- ovn-controller - The OVN agent that interacts with OVS and hypervisors for any information or update that is needed for the sbdb. The ovn-controller reads logical flows from the sbdb, translates them into OpenFlow flows, and sends them to the node’s OVS daemon. The container name is ovn-controller and it runs in the ovnkube-node pods.
The OVN northbound database has the logical network configuration passed down to it by the cloud management system (CMS). The OVN northbound Database contains the current desired state of the network, presented as a collection of logical ports, logical switches, logical routers, and more. The ovn-northd (northd container) connects to the OVN northbound database and the OVN southbound database. It translates the logical network configuration in terms of conventional network concepts, taken from the OVN northbound Database, into logical data path flows in the OVN southbound database.
The OVN southbound database has physical and logical representations of the network and binding tables that link them together. Every node in the cluster is represented in the southbound database, and you can see the ports that are connected to it. It also contains all the logic flows; the logic flows are shared with the ovn-controller process that runs on each node, and the ovn-controller turns them into OpenFlow rules to program Open vSwitch.
The Kubernetes control plane nodes each contain an ovnkube-master pod which hosts containers for the OVN northbound and southbound databases. All OVN northbound databases form a Raft cluster and all southbound databases form a separate Raft cluster. At any given time a single ovnkube-master is the leader and the other ovnkube-master pods are followers.
23.2.2. Listing all resources in the OVN-Kubernetes project Copy linkLink copied to clipboard!
Finding the resources and containers that run in the OVN-Kubernetes project is important to help you understand the OVN-Kubernetes networking implementation.
Prerequisites
- Access to the cluster as a user with the cluster-admin role.
- The OpenShift CLI (oc) installed.
Procedure
Run the following command to get all resources, endpoints, and ConfigMaps in the OVN-Kubernetes project:
$ oc get all,ep,cm -n openshift-ovn-kubernetes
Example output
There are three ovnkube-master pods that run on the control plane nodes, and two daemon sets used to deploy the ovnkube-master and ovnkube-node pods. There is one ovnkube-node pod for each node in the cluster. In this example, because there is one ovnkube-node per node in the cluster, there are five nodes in the cluster. The ovnkube-config ConfigMap has the OpenShift Container Platform OVN-Kubernetes configurations started by ovnkube-master and ovnkube-node. The ovn-kubernetes-master ConfigMap has the information of the current ovnkube-master leader.
List all the containers in the ovnkube-master pods by running the following command:
$ oc get pods ovnkube-master-9g7zt \
    -o jsonpath='{.spec.containers[*].name}' -n openshift-ovn-kubernetes
Expected output
northd nbdb kube-rbac-proxy sbdb ovnkube-master ovn-dbchecker
The ovnkube-master pod is made up of several containers. It is responsible for hosting the northbound database (nbdb container), the southbound database (sbdb container), watching for cluster events for pods, egress IPs, namespaces, services, endpoints, egress firewalls, and network policies and writing them to the northbound database (ovnkube-master pod), as well as managing pod subnet allocation to nodes.
List all the containers in the ovnkube-node pods by running the following command:
$ oc get pods ovnkube-node-jg52r \
    -o jsonpath='{.spec.containers[*].name}' -n openshift-ovn-kubernetes
Expected output
ovn-controller ovn-acl-logging kube-rbac-proxy kube-rbac-proxy-ovn-metrics ovnkube-node
The ovnkube-node pod has a container (ovn-controller) that resides on each OpenShift Container Platform node. Each node’s ovn-controller connects the OVN northbound to the OVN southbound database to learn about the OVN configuration. The ovn-controller connects southbound to ovs-vswitchd as an OpenFlow controller, for control over network traffic, and to the local ovsdb-server to allow it to monitor and control Open vSwitch configuration.
23.2.3. Listing the OVN-Kubernetes northbound database contents Copy linkLink copied to clipboard!
To understand logic flow rules, you need to examine the northbound database and understand what objects are there and how they are translated into logic flow rules. The up-to-date information is present on the OVN Raft leader, and this procedure describes how to find the Raft leader and subsequently query it to list the OVN northbound database contents.
Prerequisites
- Access to the cluster as a user with the cluster-admin role.
- The OpenShift CLI (oc) installed.
Procedure
Find the OVN Raft leader for the northbound database.
Note: The Raft leader stores the most up-to-date information.
List the pods by running the following command:
$ oc get po -n openshift-ovn-kubernetes
Example output
Choose one of the master pods at random and run the following command:
$ oc exec -n openshift-ovn-kubernetes ovnkube-master-7j97q \
    -- /usr/bin/ovn-appctl -t /var/run/ovn/ovnnb_db.ctl \
    --timeout=3 cluster/status OVN_Northbound
Example output
Find the ovnkube-master pod running on IP address 10.0.242.240 by using the following command:
$ oc get po -o wide -n openshift-ovn-kubernetes | grep 10.0.242.240 | grep -v ovnkube-node
Example output
ovnkube-master-gt4ms   6/6   Running   1 (143m ago)   150m   10.0.242.240   ip-10-0-242-240.ec2.internal   <none>   <none>
The ovnkube-master-gt4ms pod runs on IP address 10.0.242.240.
Run the following command to show all the objects in the northbound database:
$ oc exec -n openshift-ovn-kubernetes -it ovnkube-master-gt4ms \
    -c northd -- ovn-nbctl show
The output is too long to list here. The list includes the NAT rules, logical switches, load balancers, and so on.
Run the following command to display the options available with the command ovn-nbctl:
$ oc exec -n openshift-ovn-kubernetes -it ovnkube-master-mk6p6 \
    -c northd ovn-nbctl --help
You can narrow down and focus on specific components by using some of the following commands:
Run the following command to show the list of logical routers:
$ oc exec -n openshift-ovn-kubernetes -it ovnkube-master-gt4ms \
    -c northd -- ovn-nbctl lr-list
Example output
Note: From this output you can see there is a router on each node plus an ovn_cluster_router.
Run the following command to show the list of logical switches:
$ oc exec -n openshift-ovn-kubernetes -it ovnkube-master-gt4ms \
    -c northd -- ovn-nbctl ls-list
Example output
Note: From this output you can see there is an ext switch for each node plus switches with the node name itself and a join switch.
Run the following command to show the list of load balancers:
$ oc exec -n openshift-ovn-kubernetes -it ovnkube-master-gt4ms \
    -c northd -- ovn-nbctl lb-list
Example output
Note: From this truncated output you can see there are many OVN-Kubernetes load balancers. Load balancers in OVN-Kubernetes are representations of services.
23.2.4. Command-line arguments for ovn-nbctl to examine northbound database contents Copy linkLink copied to clipboard!
The following table describes the command-line arguments that can be used with ovn-nbctl to examine the contents of the northbound database.
| Argument | Description |
|---|---|
| ovn-nbctl show | An overview of the northbound database contents. |
| ovn-nbctl show <switch_or_router> | Show the details associated with the specified switch or router. |
| ovn-nbctl lr-list | Show the logical routers. |
| ovn-nbctl lrp-list <router> | Using the router information from ovn-nbctl lr-list, show the router ports. |
| ovn-nbctl lr-nat-list <router> | Show network address translation details for the specified router. |
| ovn-nbctl ls-list | Show the logical switches. |
| ovn-nbctl lsp-list <switch> | Using the switch information from ovn-nbctl ls-list, show the switch ports. |
| ovn-nbctl lsp-get-type <port> | Get the type for the logical port. |
| ovn-nbctl lb-list | Show the load balancers. |
23.2.5. Listing the OVN-Kubernetes southbound database contents Copy linkLink copied to clipboard!
Logic flow rules are stored in the southbound database, which is a representation of your infrastructure. The up-to-date information is present on the OVN Raft leader, and this procedure describes how to find the Raft leader and query it to list the OVN southbound database contents.
Prerequisites
- Access to the cluster as a user with the cluster-admin role.
- The OpenShift CLI (oc) installed.
Procedure
Find the OVN Raft leader for the southbound database.
Note: The Raft leader stores the most up-to-date information.
List the pods by running the following command:
$ oc get po -n openshift-ovn-kubernetes
Example output
Choose one of the master pods at random and run the following command to find the OVN southbound Raft leader:
$ oc exec -n openshift-ovn-kubernetes ovnkube-master-7j97q \
    -- /usr/bin/ovn-appctl -t /var/run/ovn/ovnsb_db.ctl \
    --timeout=3 cluster/status OVN_Southbound
Example output
Find the ovnkube-master pod running on IP address 10.0.163.212 by using the following command:
$ oc get po -o wide -n openshift-ovn-kubernetes | grep 10.0.163.212 | grep -v ovnkube-node
Example output
ovnkube-master-mk6p6   6/6   Running   0   136m   10.0.163.212   ip-10-0-163-212.ec2.internal   <none>   <none>
The ovnkube-master-mk6p6 pod runs on IP address 10.0.163.212.
Run the following command to show all the information stored in the southbound database:
$ oc exec -n openshift-ovn-kubernetes -it ovnkube-master-mk6p6 \
    -c northd -- ovn-sbctl show
Example output
This detailed output shows the chassis and the ports that are attached to the chassis, which in this case are all of the router ports and anything that runs like host networking. Any pods communicate out to the wider network by using source network address translation (SNAT). Their IP address is translated into the IP address of the node that the pod is running on and then sent out into the network.
In addition to the chassis information, the southbound database has all the logic flows, and those logic flows are then sent to the ovn-controller running on each of the nodes. The ovn-controller translates the logic flows into OpenFlow rules and ultimately programs Open vSwitch so that your pods can then follow OpenFlow rules and make it out of the network.
Run the following command to display the options available with the command ovn-sbctl:
$ oc exec -n openshift-ovn-kubernetes -it ovnkube-master-mk6p6 \
    -c northd -- ovn-sbctl --help
23.2.6. Command-line arguments for ovn-sbctl to examine southbound database contents Copy linkLink copied to clipboard!
The following table describes the command-line arguments that can be used with ovn-sbctl to examine the contents of the southbound database.
| Argument | Description |
|---|---|
| ovn-sbctl show | Overview of the southbound database contents. |
| ovn-sbctl list Port_Binding <port> | List the contents of the southbound database for the specified port. |
| ovn-sbctl dump-flows | List the logical flows. |
23.2.7. OVN-Kubernetes logical architecture Copy linkLink copied to clipboard!
OVN is a network virtualization solution. It creates logical switches and routers. These switches and routers are interconnected to create any network topologies. When you run ovnkube-trace with the log level set to 2 or 5 the OVN-Kubernetes logical components are exposed. The following diagram shows how the routers and switches are connected in OpenShift Container Platform.
Figure 23.2. OVN-Kubernetes router and switch components
The key components involved in packet processing are:
- Gateway routers
- Gateway routers, sometimes called L3 gateway routers, are typically used between the distributed routers and the physical network. Gateway routers, including their logical patch ports, are bound to a physical location (not distributed), or chassis. The patch ports on this router are known as l3gateway ports in the ovn-southbound database (ovn-sbdb).
- Distributed logical routers and the logical switches behind them, to which virtual machines and containers attach, effectively reside on each hypervisor.
- Join local switch
- Join local switches are used to connect the distributed router and gateway routers. They reduce the number of IP addresses needed on the distributed router.
- Logical switches with patch ports
- Logical switches with patch ports are used to virtualize the network stack. They connect remote logical ports through tunnels.
- Logical switches with localnet ports
- Logical switches with localnet ports are used to connect OVN to the physical network. They connect remote logical ports by bridging the packets to directly connected physical L2 segments using localnet ports.
- Patch ports
- Patch ports represent connectivity between logical switches and logical routers and between peer logical routers. A single connection has a pair of patch ports at each such point of connectivity, one on each side.
- l3gateway ports
- l3gateway ports are the port binding entries in the ovn-sbdb for logical patch ports used in the gateway routers. They are called l3gateway ports rather than patch ports to portray the fact that these ports are bound to a chassis just like the gateway router itself.
- localnet ports
- localnet ports are present on the bridged logical switches that allow a connection to a locally accessible network from each ovn-controller instance. This helps model the direct connectivity to the physical network from the logical switches. A logical switch can have only a single localnet port attached to it.
23.2.7.1. Installing network-tools on local host Copy linkLink copied to clipboard!
Install network-tools on your local host to make a collection of tools available for debugging OpenShift Container Platform cluster network issues.
Procedure
Clone the network-tools repository onto your workstation with the following command:
$ git clone git@github.com:openshift/network-tools.git
Change into the directory for the repository you just cloned:
$ cd network-tools
Optional: List all available commands:
$ ./debug-scripts/network-tools -h
23.2.7.2. Running network-tools Copy linkLink copied to clipboard!
Get information about the logical switches and routers by running network-tools.
Prerequisites
- You installed the OpenShift CLI (oc).
- You are logged in to the cluster as a user with cluster-admin privileges.
- You have installed network-tools on the local host.
Procedure
List the routers by running the following command:
$ ./debug-scripts/network-tools ovn-db-run-command ovn-nbctl lr-list
Example output
List the localnet ports by running the following command:
$ ./debug-scripts/network-tools ovn-db-run-command \
    ovn-sbctl find Port_Binding type=localnet
Example output
List the l3gateway ports by running the following command:
$ ./debug-scripts/network-tools ovn-db-run-command \
    ovn-sbctl find Port_Binding type=l3gateway
Example output
List the patch ports by running the following command:
$ ./debug-scripts/network-tools ovn-db-run-command \
    ovn-sbctl find Port_Binding type=patch
Example output
23.3. Troubleshooting OVN-Kubernetes Copy linkLink copied to clipboard!
OVN-Kubernetes has many sources of built-in health checks and logs.
23.3.1. Monitoring OVN-Kubernetes health by using readiness probes Copy linkLink copied to clipboard!
The ovnkube-master and ovnkube-node pods have containers configured with readiness probes.
Prerequisites
- Access to the OpenShift CLI (oc).
- You have access to the cluster with cluster-admin privileges.
- You have installed jq.
Procedure
Review the details of the ovnkube-master readiness probe by running the following command:
$ oc get pods -n openshift-ovn-kubernetes -l app=ovnkube-master \
    -o json | jq '.items[0].spec.containers[] | .name,.readinessProbe'
The readiness probe for the northbound and southbound database containers in the ovnkube-master pod checks for the health of the Raft cluster hosting the databases.
Review the details of the ovnkube-node readiness probe by running the following command:
$ oc get pods -n openshift-ovn-kubernetes -l app=ovnkube-node \
    -o json | jq '.items[0].spec.containers[] | .name,.readinessProbe'
The ovnkube-node container in the ovnkube-node pod has a readiness probe to verify the presence of the ovn-kubernetes CNI configuration file, the absence of which would indicate that the pod is not running or is not ready to accept requests to configure pods.
Show all events, including probe failures, for the namespace by using the following command:
$ oc get events -n openshift-ovn-kubernetes
Show the events for just this pod:
$ oc describe pod ovnkube-master-tp2z8 -n openshift-ovn-kubernetes
Show the messages and statuses from the cluster network operator:
$ oc get co/network -o json | jq '.status.conditions[]'
Show the ready status of each container in ovnkube-master pods by running the following script:
$ for p in $(oc get pods --selector app=ovnkube-master -n openshift-ovn-kubernetes \
    -o jsonpath='{range.items[*]}{" "}{.metadata.name}'); do echo === $p ===; \
    oc get pods -n openshift-ovn-kubernetes $p -o json | jq '.status.containerStatuses[] | .name, .ready'; \
    done
Note: The expectation is that all container statuses are reporting as true. Failure of a readiness probe sets the status to false.
23.3.2. Viewing OVN-Kubernetes alerts in the console Copy linkLink copied to clipboard!
The Alerting UI provides detailed information about alerts and their governing alerting rules and silences.
Prerequisites
- You have access to the cluster as a developer or as a user with view permissions for the project that you are viewing metrics for.
Procedure (UI)
- In the Administrator perspective, select Observe → Alerting. The three main pages in the Alerting UI in this perspective are the Alerts, Silences, and Alerting Rules pages.
- View the rules for OVN-Kubernetes alerts by selecting Observe → Alerting → Alerting Rules.
23.3.3. Viewing OVN-Kubernetes alerts in the CLI Copy linkLink copied to clipboard!
You can get information about alerts and their governing alerting rules and silences from the command line.
Prerequisites
- Access to the cluster as a user with the cluster-admin role.
- The OpenShift CLI (oc) installed.
- You have installed jq.
Procedure
View active or firing alerts by running the following commands.
Set the alert manager route environment variable by running the following command:
$ ALERT_MANAGER=$(oc get route alertmanager-main -n openshift-monitoring \
    -o jsonpath='{@.spec.host}')
Issue a curl request to the alert manager route API with the correct authorization details, requesting specific fields, by running the following command:
$ curl -s -k -H "Authorization: Bearer \
    $(oc create token prometheus-k8s -n openshift-monitoring)" \
    https://$ALERT_MANAGER/api/v1/alerts \
    | jq '.data[] | "\(.labels.severity) \(.labels.alertname) \(.labels.pod) \(.labels.container) \(.labels.endpoint) \(.labels.instance)"'
View alerting rules by running the following command:
$ oc -n openshift-monitoring exec -c prometheus prometheus-k8s-0 -- curl -s 'http://localhost:9090/api/v1/rules' | jq '.data.groups[].rules[] | select(((.name|contains("ovn")) or (.name|contains("OVN")) or (.name|contains("Ovn")) or (.name|contains("North")) or (.name|contains("South"))) and .type=="alerting")'
23.3.4. Viewing the OVN-Kubernetes logs using the CLI Copy linkLink copied to clipboard!
You can view the logs for each of the pods in the ovnkube-master and ovnkube-node pods using the OpenShift CLI (oc).
Prerequisites
- Access to the cluster as a user with the cluster-admin role.
- Access to the OpenShift CLI (oc).
- You have installed jq.
Procedure
View the log for a specific pod:
$ oc logs -f <pod_name> -c <container_name> -n <namespace>
where:
-f - Optional: Specifies that the output follows what is being written into the logs.
<pod_name> - Specifies the name of the pod.
<container_name> - Optional: Specifies the name of a container. When a pod has more than one container, you must specify the container name.
<namespace> - Specifies the namespace that the pod is running in.
For example:
oc logs ovnkube-master-7h4q7 -n openshift-ovn-kubernetes
$ oc logs ovnkube-master-7h4q7 -n openshift-ovn-kubernetesCopy to Clipboard Copied! Toggle word wrap Toggle overflow oc logs -f ovnkube-master-7h4q7 -n openshift-ovn-kubernetes -c ovn-dbchecker
$ oc logs -f ovnkube-master-7h4q7 -n openshift-ovn-kubernetes -c ovn-dbcheckerCopy to Clipboard Copied! Toggle word wrap Toggle overflow The contents of log files are printed out.
Examine the most recent entries in all the containers in the ovnkube-master pods:

$ for p in $(oc get pods --selector app=ovnkube-master -n openshift-ovn-kubernetes \
-o jsonpath='{range.items[*]}{" "}{.metadata.name}'); \
do echo === $p ===; for container in $(oc get pods -n openshift-ovn-kubernetes $p \
-o json | jq -r '.status.containerStatuses[] | .name');do echo ---$container---; \
oc logs -c $container $p -n openshift-ovn-kubernetes --tail=5; done; done

View the last 5 lines of every log in every container in an ovnkube-master pod using the following command:

$ oc logs -l app=ovnkube-master -n openshift-ovn-kubernetes --all-containers --tail 5
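The ovnkube-node pods can be inspected the same way. A minimal sketch, assuming the app=ovnkube-node label that is also used later in this chapter, which prints the last 5 lines of every log in every container of the ovnkube-node pods:

$ oc logs -l app=ovnkube-node -n openshift-ovn-kubernetes --all-containers --tail 5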
23.3.5. Viewing the OVN-Kubernetes logs using the web console
You can view the logs for the ovnkube-master and ovnkube-node pods in the web console.
Prerequisites
- Access to the OpenShift CLI (oc).
Procedure
- In the OpenShift Container Platform console, navigate to Workloads → Pods or navigate to the pod through the resource you want to investigate.
- Select the openshift-ovn-kubernetes project from the drop-down menu.
- Click the name of the pod you want to investigate.
- Click Logs. By default for the ovnkube-master pod, the logs associated with the northd container are displayed.
- Use the drop-down menu to select logs for each container in turn.
23.3.5.1. Changing the OVN-Kubernetes log levels
The default log level for OVN-Kubernetes is 2. To debug OVN-Kubernetes, set the log level to 5. Follow this procedure to increase the log level of OVN-Kubernetes to help you debug an issue.
Prerequisites
- You have access to the cluster with cluster-admin privileges.
- You have access to the OpenShift Container Platform web console.
Procedure
Run the following command to get detailed information for all pods in the OVN-Kubernetes project:
$ oc get po -o wide -n openshift-ovn-kubernetes

Create a ConfigMap file similar to the following example and use a filename such as env-overrides.yaml:
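A minimal sketch of such a ConfigMap, assuming the ovnkube pods read their log level from the OVN_KUBE_LOG_LEVEL environment variable; the node names and the _master key are illustrative placeholders chosen to match the pod restart commands later in this procedure:

kind: ConfigMap
apiVersion: v1
metadata:
  name: env-overrides
  namespace: openshift-ovn-kubernetes
data:
  # Assumed per-node entries: raise the log level for the ovnkube-node pods on these nodes
  ip-10-0-217-114.ec2.internal: |
    OVN_KUBE_LOG_LEVEL=5
  ip-10-0-209-180.ec2.internal: |
    OVN_KUBE_LOG_LEVEL=5
  # Assumed key for the ovnkube-master pods
  _master: |
    OVN_KUBE_LOG_LEVEL=5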
Apply the ConfigMap file by using the following command:

$ oc apply -n openshift-ovn-kubernetes -f env-overrides.yaml

Example output
configmap/env-overrides.yaml created

Restart the ovnkube pods to apply the new log level by using the following commands:

$ oc delete pod -n openshift-ovn-kubernetes \
--field-selector spec.nodeName=ip-10-0-217-114.ec2.internal -l app=ovnkube-node

$ oc delete pod -n openshift-ovn-kubernetes \
--field-selector spec.nodeName=ip-10-0-209-180.ec2.internal -l app=ovnkube-node

$ oc delete pod -n openshift-ovn-kubernetes -l app=ovnkube-master
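To confirm that the deleted pods have been recreated and are running with the new configuration, you can list the pods again; a minimal check:

$ oc get pods -n openshift-ovn-kubernetes -o wide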
23.3.6. Checking the OVN-Kubernetes pod network connectivity
The connectivity check controller, in OpenShift Container Platform 4.10 and later, orchestrates connection verification checks in your cluster. These include the Kubernetes API, the OpenShift API, and individual nodes. The results of the connection tests are stored in PodNetworkConnectivityCheck objects in the openshift-network-diagnostics namespace. Connection tests are performed every minute in parallel.
Prerequisites
- Access to the OpenShift CLI (oc).
- Access to the cluster as a user with the cluster-admin role.
- You have installed jq.
Procedure
To list the current PodNetworkConnectivityCheck objects, enter the following command:

$ oc get podnetworkconnectivitychecks -n openshift-network-diagnostics

View the most recent success for each connection object by using the following command:

$ oc get podnetworkconnectivitychecks -n openshift-network-diagnostics \
-o json | jq '.items[]| .spec.targetEndpoint,.status.successes[0]'

View the most recent failures for each connection object by using the following command:

$ oc get podnetworkconnectivitychecks -n openshift-network-diagnostics \
-o json | jq '.items[]| .spec.targetEndpoint,.status.failures[0]'

View the most recent outages for each connection object by using the following command:

$ oc get podnetworkconnectivitychecks -n openshift-network-diagnostics \
-o json | jq '.items[]| .spec.targetEndpoint,.status.outages[0]'

The connectivity check controller also logs metrics from these checks into Prometheus.
View all the metrics by running the following command:
$ oc exec prometheus-k8s-0 -n openshift-monitoring -- \
promtool query instant http://localhost:9090 \
'{component="openshift-network-diagnostics"}'

View the latency between the source pod and the OpenShift API service for the last 5 minutes:

$ oc exec prometheus-k8s-0 -n openshift-monitoring -- \
promtool query instant http://localhost:9090 \
'{component="openshift-network-diagnostics"}'
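To examine one connection in more depth, you can print a single PodNetworkConnectivityCheck object in full. A minimal sketch, where <check_name> is a placeholder for one of the names returned by the list command earlier in this procedure:

$ oc get podnetworkconnectivitycheck <check_name> -n openshift-network-diagnostics -o yaml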
23.4. Tracing OpenFlow with ovnkube-trace
OVN and OVS traffic flows can be simulated in a single utility called ovnkube-trace. The ovnkube-trace utility runs ovn-trace, ovs-appctl ofproto/trace and ovn-detrace and correlates that information in a single output.
You can execute the ovnkube-trace binary from a dedicated container. For releases after OpenShift Container Platform 4.7, you can also copy the binary to a local host and execute it from that host.
The binaries in the Quay images do not currently work for dual IP stack or IPv6-only environments. For those environments, you must build from source.
23.4.1. Installing the ovnkube-trace on local host
The ovnkube-trace tool traces packet simulations for arbitrary UDP or TCP traffic between points in an OVN-Kubernetes-driven OpenShift Container Platform cluster. Copy the ovnkube-trace binary to your local host, making it available to run against the cluster.
Prerequisites
- You installed the OpenShift CLI (oc).
- You are logged in to the cluster as a user with cluster-admin privileges.
Procedure
Create a pod variable by using the following command:
$ POD=$(oc get pods -n openshift-ovn-kubernetes -l app=ovnkube-master -o name | head -1 | awk -F '/' '{print $NF}')

Run the following command on your local host to copy the binary from the ovnkube-master pods:

$ oc cp -n openshift-ovn-kubernetes $POD:/usr/bin/ovnkube-trace ovnkube-trace
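The copied binary typically does not have the executable bit set. Assuming you want to run it directly from your current working directory, a minimal follow-up step is:

$ chmod +x ovnkube-trace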