Chapter 6. BGP routing


6.1. About BGP routing

To integrate BGP with MetalLB and FRR-K8s in OpenShift Container Platform, review how FRR-K8s resources model cluster routing. If cluster administrators or third-party components created FRRConfiguration custom resources in the metallb-system namespace outside of the MetalLB Operator, migrate those resources to the openshift-frr-k8s namespace.

Important

If you are using the MetalLB Operator and there are existing FRRConfiguration CRs in the metallb-system namespace created by cluster administrators or third-party cluster components other than the MetalLB Operator, you must ensure that they are copied to the openshift-frr-k8s namespace or that those third-party cluster components use the new namespace. For more information, see "Migrating FRR-K8s resources".

6.1.1. About Border Gateway Protocol (BGP) routing

To enable external routing for your cluster, configure Border Gateway Protocol (BGP) using FRRouting (FRR) and the FRR-K8s daemon. You can define routing behavior with the FRRConfiguration custom resource (CR) and ensure compatibility with the MetalLB Operator by using the required namespace and migration approach.

OpenShift Container Platform supports BGP routing through FRR, a free, open source internet routing protocol suite for Linux, UNIX, and similar operating systems. FRR-K8s is a Kubernetes-based daemon set that exposes a subset of the FRR API in a Kubernetes-compliant manner. As a cluster administrator, you can use the FRRConfiguration custom resource to access FRR services.

The following diagram shows a multi-tenancy environment where two namespaces exist on an OpenShift Container Platform node. When the OVN-Kubernetes gateway router sends traffic from a namespace to an external source, the traffic passes through the default virtual routing and forwarding (VRF) instance. BGP advertisement occurs when the FRR or OVN-Kubernetes router establishes a BGP session with the router of the cloud provider. This session ensures the router of the cloud provider knows that the node is the next-hop IP address for reaching the pod or service networks.

Figure 6.1. BGP advertisement without a VPN

The following diagram shows multiple VRF BGP instances that use VRF lite. This architecture supports only local gateway mode. VRF lite provides network virtualization by using user-defined networks (UDNs) to isolate pod traffic without the heavy encapsulation typical of Multi-Protocol Label Switching (MPLS) or Ethernet Virtual Private Network (EVPN) protocols. Separate layer 3 links are mapped to specific VRFs, so independent BGP peering sessions route traffic to the next-hop router. You can also use this layer 3 mechanism in multi-cloud deployments so that specific namespaces can exist across the network.

Figure 6.2. Multiple VRF BGP instances that use VRF lite

6.1.1.1. Supported platforms

BGP routing is supported on the following infrastructure types:

  • Bare metal

BGP routing requires that you have properly configured BGP for your network provider. Outages or misconfigurations of your network provider might cause disruptions to your cluster network.

6.1.1.2. Considerations for use with the MetalLB Operator

The MetalLB Operator is installed as an add-on to the cluster. Deployment of the MetalLB Operator automatically enables FRR-K8s as an additional routing capability provider and uses the FRR-K8s daemon installed by this feature.

Before you upgrade to 4.18, manually copy any existing FRRConfiguration CR in the metallb-system namespace that is not managed by the MetalLB Operator (for example, a CR added by a cluster administrator or another component) to the openshift-frr-k8s namespace. Create the namespace if it does not exist.

Important

If you are using the MetalLB Operator and there are existing FRRConfiguration CRs in the metallb-system namespace created by cluster administrators or third-party cluster components other than the MetalLB Operator, you must:

  • Ensure that these existing FRRConfiguration CRs are copied to the openshift-frr-k8s namespace.
  • Ensure that the third-party cluster components use the new namespace for the FRRConfiguration CRs that they create.
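The copy into the new namespace can be scripted. The following Python sketch is one possible approach, not a supported tool. It assumes that you exported each FRRConfiguration CR as JSON (for example, with oc get frrconfiguration -n metallb-system -o json) and loaded each item into a dictionary. The function name retarget_frrconfiguration and the list of stripped metadata fields are illustrative assumptions.

```python
import copy

# Metadata that the API server populates; it should not be carried into a new object.
# This list is an assumption for illustration and might not be exhaustive.
SERVER_FIELDS = ("resourceVersion", "uid", "creationTimestamp",
                 "generation", "managedFields")


def retarget_frrconfiguration(manifest, new_namespace="openshift-frr-k8s"):
    """Return a copy of an FRRConfiguration manifest that targets new_namespace."""
    if manifest.get("kind") != "FRRConfiguration":
        raise ValueError("expected an FRRConfiguration manifest")
    migrated = copy.deepcopy(manifest)  # leave the exported original untouched
    metadata = migrated.setdefault("metadata", {})
    for field in SERVER_FIELDS:
        metadata.pop(field, None)
    metadata["namespace"] = new_namespace
    migrated.pop("status", None)  # status is observed state, not configuration
    return migrated
```

You can then write the returned dictionary to a file and create it with oc apply -f. Deciding which CRs are managed by the MetalLB Operator, and must therefore be left alone, remains a manual step in this sketch.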

6.1.1.3. Cluster Network Operator configuration

The Cluster Network Operator API exposes the following API field to configure BGP routing:

  • spec.additionalRoutingCapabilities: Enables deployment of the FRR-K8s daemon for the cluster, which can be used independently of route advertisements. When enabled, the FRR-K8s daemon is deployed on all nodes.

6.1.1.4. BGP routing custom resources

The following custom resources are used to configure BGP routing:

FRRConfiguration
This custom resource defines the FRR configuration for BGP routing. This CR is namespaced.

6.1.2. Configuring the FRRConfiguration CR

To customize routing behavior beyond standard MetalLB capabilities, configure the FRRConfiguration custom resource (CR).

The following reference examples demonstrate how to define specific FRRouting (FRR) parameters to enable advanced services, such as receiving routes:

The routers parameter

You can use the routers parameter to configure multiple routers, one for each Virtual Routing and Forwarding (VRF) resource. For each router, you must define the Autonomous System Number (ASN).

You can also define a list of Border Gateway Protocol (BGP) neighbors to connect to, as in the following example:

Example FRRConfiguration CR

apiVersion: frrk8s.metallb.io/v1beta1
kind: FRRConfiguration
metadata:
  name: test
  namespace: frr-k8s-system
spec:
  bgp:
    routers:
    - asn: 64512
      neighbors:
      - address: 172.30.0.3
        asn: 4200000000
        ebgpMultiHop: true
        port: 180
      - address: 172.18.0.6
        asn: 4200000000
        port: 179
# ...

The toAdvertise parameter

By default, FRR-K8s does not advertise the prefixes configured as part of a router configuration. To advertise the prefixes, use the toAdvertise parameter.

You can advertise a subset of the prefixes, as in the following example:

Example FRRConfiguration CR

apiVersion: frrk8s.metallb.io/v1beta1
kind: FRRConfiguration
metadata:
  name: test
  namespace: frr-k8s-system
spec:
  bgp:
    routers:
    - asn: 64512
      neighbors:
      - address: 172.30.0.3
        asn: 4200000000
        ebgpMultiHop: true
        port: 180
        toAdvertise:
          allowed:
            prefixes:
            - 192.168.2.0/24
      prefixes:
        - 192.168.2.0/24
        - 192.169.2.0/24
# ...

  • allowed.prefixes: Advertises a subset of prefixes.
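The parameter table for this CR states that the toAdvertise.allowed.prefixes list must match prefixes that you define on the router. The following Python sketch shows one way to check a router entry for that condition before you apply it. The function name unadvertisable_prefixes is hypothetical, the router is assumed to be a dictionary loaded from the CR, and prefixes are compared as plain strings without normalization.

```python
def unadvertisable_prefixes(router):
    """Return (peer, prefix) pairs whose allowed prefix is not declared on the router."""
    declared = set(router.get("prefixes", []))
    problems = []
    for neighbor in router.get("neighbors", []):
        allowed = neighbor.get("toAdvertise", {}).get("allowed", {})
        for prefix in allowed.get("prefixes", []):
            if prefix not in declared:
                # A neighbor is identified by its address or, for unnumbered peering, its interface.
                peer = neighbor.get("address") or neighbor.get("interface")
                problems.append((peer, prefix))
    return problems


router = {
    "asn": 64512,
    "prefixes": ["192.168.2.0/24", "192.169.2.0/24"],
    "neighbors": [{
        "address": "172.30.0.3",
        "asn": 4200000000,
        "toAdvertise": {"allowed": {"prefixes": ["192.168.2.0/24"]}},
    }],
}
print(unadvertisable_prefixes(router))  # []
```

An empty list means that every prefix a neighbor is allowed to receive is also declared in the router's prefixes list.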

The following example shows you how to advertise all of the prefixes:

Example FRRConfiguration CR

apiVersion: frrk8s.metallb.io/v1beta1
kind: FRRConfiguration
metadata:
  name: test
  namespace: frr-k8s-system
spec:
  bgp:
    routers:
    - asn: 64512
      neighbors:
      - address: 172.30.0.3
        asn: 4200000000
        ebgpMultiHop: true
        port: 180
        toAdvertise:
          allowed:
            mode: all
      prefixes:
        - 192.168.2.0/24
        - 192.169.2.0/24
# ...

  • allowed.mode: Advertises all prefixes.

The toReceive parameter

By default, FRR-K8s does not process any prefixes advertised by a neighbor. You can use the toReceive parameter to process such prefixes.

You can receive a subset of the prefixes, as in the following example:

Example FRRConfiguration CR

apiVersion: frrk8s.metallb.io/v1beta1
kind: FRRConfiguration
metadata:
  name: test
  namespace: frr-k8s-system
spec:
  bgp:
    routers:
    - asn: 64512
      neighbors:
      - address: 172.18.0.5
        asn: 64512
        port: 179
        toReceive:
          allowed:
            prefixes:
            - prefix: 192.168.1.0/24
            - prefix: 192.169.2.0/24
              ge: 25
              le: 28
# ...

  • prefixes: The prefix is applied if the prefix length is less than or equal to the le prefix length and greater than or equal to the ge prefix length.
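To make the ge and le rule concrete, the following Python sketch evaluates an announced route against a single toReceive.allowed.prefixes entry. It is an illustration and not the FRR implementation. It assumes conventional prefix-list semantics: an entry without ge or le matches only the exact prefix, and a missing bound defaults to the entry's own length (for ge) or the maximum length of the address family (for le). The function name is_received is hypothetical.

```python
import ipaddress


def is_received(route, entry):
    """Return True if an announced route matches one allowed.prefixes entry."""
    announced = ipaddress.ip_network(route)
    base = ipaddress.ip_network(entry["prefix"])
    if announced.version != base.version:
        return False
    if "ge" not in entry and "le" not in entry:
        return announced == base  # no length bounds: exact match only
    # Assumed defaults when only one bound is present.
    low = entry.get("ge", base.prefixlen)
    high = entry.get("le", announced.max_prefixlen)
    # The route must sit inside the entry's prefix and its length must be in [ge, le].
    return announced.subnet_of(base) and low <= announced.prefixlen <= high


entry = {"prefix": "192.169.2.0/24", "ge": 25, "le": 28}
print(is_received("192.169.2.128/25", entry))  # True
print(is_received("192.169.2.0/24", entry))    # False
```

With the example entry, a /25 inside 192.169.2.0/24 is accepted, but the /24 itself is rejected because its length is below ge.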

The following example configures FRR to handle all the prefixes announced:

Example FRRConfiguration CR

apiVersion: frrk8s.metallb.io/v1beta1
kind: FRRConfiguration
metadata:
  name: test
  namespace: frr-k8s-system
spec:
  bgp:
    routers:
    - asn: 64512
      neighbors:
      - address: 172.18.0.5
        asn: 64512
        port: 179
        toReceive:
          allowed:
            mode: all
# ...

The bgp parameter

You can use the bgp parameter to define Bidirectional Forwarding Detection (BFD) profiles in the bfdProfiles list and associate them with a neighbor. In the following example, BFD backs up the BGP session so that FRR can detect link failures:

Example FRRConfiguration CR

apiVersion: frrk8s.metallb.io/v1beta1
kind: FRRConfiguration
metadata:
  name: test
  namespace: frr-k8s-system
spec:
  bgp:
    routers:
    - asn: 64512
      neighbors:
      - address: 172.30.0.3
        asn: 64512
        port: 180
        bfdProfile: defaultprofile
    bfdProfiles:
      - name: defaultprofile
# ...

The nodeSelector parameter

By default, FRR-K8s applies the configuration to all nodes where the daemon is running. You can use the nodeSelector parameter to specify the nodes to which you want to apply the configuration. For example:

Example FRRConfiguration CR

apiVersion: frrk8s.metallb.io/v1beta1
kind: FRRConfiguration
metadata:
  name: test
  namespace: frr-k8s-system
spec:
  bgp:
    routers:
    - asn: 64512
  nodeSelector:
    matchLabels:
      foo: "bar"
# ...

The interface parameter

You can use the interface parameter to configure unnumbered BGP peering by using the following example configuration:

Example FRRConfiguration CR

apiVersion: frrk8s.metallb.io/v1beta1
kind: FRRConfiguration
metadata:
  name: test
  namespace: frr-k8s-system
spec:
  bgp:
    bfdProfiles:
    - echoMode: false
      name: simple
      passiveMode: false
    routers:
    - asn: 64512
      neighbors:
      - asn: 64512
        bfdProfile: simple
        disableMP: false
        interface: net10
        port: 179
        toAdvertise:
          allowed:
            mode: filtered
            prefixes:
            - 5.5.5.5/32
        toReceive:
          allowed:
            mode: filtered
      prefixes:
      - 5.5.5.5/32
# ...

  • neighbors.interface: Activates unnumbered BGP peering.

Note

To use the interface parameter, you must establish a point-to-point, layer 2 connection between the two BGP peers. You can use unnumbered BGP peering with IPv4, IPv6, or dual-stack, but you must enable IPv6 RAs (Router Advertisements). Each interface is limited to one BGP connection.

If you use this parameter, you cannot specify a value in the spec.bgp.routers.neighbors.address parameter.
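The address and interface parameters are one of several mutually exclusive pairs on a neighbor. The parameter table that follows also lists asn with dynamicASN and password with passwordSecret. The following Python sketch checks a neighbor dictionary for such conflicts before you apply a CR. The names EXCLUSIVE_PAIRS and exclusive_conflicts are hypothetical, and the check is a local convenience rather than the validation that the cluster performs.

```python
# Pairs of neighbor parameters that the documentation describes as mutually exclusive.
EXCLUSIVE_PAIRS = [
    ("asn", "dynamicASN"),
    ("address", "interface"),
    ("password", "passwordSecret"),
]


def exclusive_conflicts(neighbor):
    """Return the exclusive pairs for which both parameters are set on the neighbor."""
    return [(a, b) for a, b in EXCLUSIVE_PAIRS if a in neighbor and b in neighbor]


unnumbered = {"asn": 64512, "interface": "net10", "port": 179}
print(exclusive_conflicts(unnumbered))  # []

invalid = {"asn": 64512, "interface": "net10", "address": "172.30.0.3"}
print(exclusive_conflicts(invalid))  # [('address', 'interface')]
```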

The parameters for the FRRConfiguration custom resource are described in the following table:

Table 6.1. MetalLB FRRConfiguration custom resource

Parameter

Type

Description

spec.bgp.routers

array

Specifies the routers that FRR is to configure (one per VRF).

spec.bgp.routers.asn

integer

The Autonomous System Number (ASN) to use for the local end of the session.

spec.bgp.routers.id

string

Specifies the ID of the BGP router.

spec.bgp.routers.vrf

string

Specifies the host VRF used to establish sessions from this router.

spec.bgp.routers.neighbors

array

Specifies the neighbors to establish BGP sessions with.

spec.bgp.routers.neighbors.asn

integer

Specifies the ASN to use for the remote end of the session. If you use this parameter, you cannot specify a value in the spec.bgp.routers.neighbors.dynamicASN parameter.

spec.bgp.routers.neighbors.dynamicASN

string

Detects the ASN to use for the remote end of the session without explicitly setting it. Specify internal for a neighbor with the same ASN, or external for a neighbor with a different ASN. If you use this parameter, you cannot specify a value in the spec.bgp.routers.neighbors.asn parameter.

spec.bgp.routers.neighbors.address

string

Specifies the IP address to establish the session with. If you use this parameter, you cannot specify a value in the spec.bgp.routers.neighbors.interface parameter.

spec.bgp.routers.neighbors.interface

string

Specifies the interface name to use when establishing a session. Use this parameter to configure unnumbered BGP peering. There must be a point-to-point, layer 2 connection between the two BGP peers. You can use unnumbered BGP peering with IPv4, IPv6, or dual-stack, but you must enable IPv6 RAs (Router Advertisements). Each interface is limited to one BGP connection.

spec.bgp.routers.neighbors.port

integer

Specifies the port to dial when establishing the session. Defaults to 179.

spec.bgp.routers.neighbors.password

string

Specifies the password to use for establishing the BGP session. The password and passwordSecret parameters are mutually exclusive.

spec.bgp.routers.neighbors.passwordSecret

string

Specifies the name of the authentication secret for the neighbor. The secret must be of type kubernetes.io/basic-auth and must be in the same namespace as the FRR-K8s daemon. The password key in the secret stores the password. The password and passwordSecret parameters are mutually exclusive.

spec.bgp.routers.neighbors.holdTime

duration

Specifies the requested BGP hold time, per RFC4271. Defaults to 180s.

spec.bgp.routers.neighbors.keepaliveTime

duration

Specifies the requested BGP keepalive time, per RFC4271. Defaults to 60s.

spec.bgp.routers.neighbors.connectTime

duration

Specifies how long BGP waits between connection attempts to a neighbor.

spec.bgp.routers.neighbors.ebgpMultiHop

boolean

Indicates whether the BGP peer is multiple hops away.

spec.bgp.routers.neighbors.bfdProfile

string

Specifies the name of the BFD Profile to use for the BFD session associated with the BGP session. If not set, the BFD session is not set up.

spec.bgp.routers.neighbors.toAdvertise.allowed

array

Represents the list of prefixes to advertise to a neighbor, and the associated properties.

spec.bgp.routers.neighbors.toAdvertise.allowed.prefixes

string array

Specifies the list of prefixes to advertise to a neighbor. This list must match the prefixes that you define in the router.

spec.bgp.routers.neighbors.toAdvertise.allowed.mode

string

Specifies the mode to use when handling the prefixes. You can set to filtered to allow only the prefixes in the prefixes list. You can set to all to allow all the prefixes configured on the router.

spec.bgp.routers.neighbors.toAdvertise.withLocalPref

array

Specifies the prefixes associated with an advertised local preference. You must specify the prefixes associated with a local preference in the prefixes allowed to be advertised.

spec.bgp.routers.neighbors.toAdvertise.withLocalPref.prefixes

string array

Specifies the prefixes associated with the local preference.

spec.bgp.routers.neighbors.toAdvertise.withLocalPref.localPref

integer

Specifies the local preference associated with the prefixes.

spec.bgp.routers.neighbors.toAdvertise.withCommunity

array

Specifies the prefixes associated with an advertised BGP community. You must include the prefixes associated with a community in the list of prefixes that you want to advertise.

spec.bgp.routers.neighbors.toAdvertise.withCommunity.prefixes

string array

Specifies the prefixes associated with the community.

spec.bgp.routers.neighbors.toAdvertise.withCommunity.community

string

Specifies the community associated with the prefixes.

spec.bgp.routers.neighbors.toReceive

array

Specifies the prefixes to receive from a neighbor.

spec.bgp.routers.neighbors.toReceive.allowed

array

Specifies the information that you want to receive from a neighbor.

spec.bgp.routers.neighbors.toReceive.allowed.prefixes

array

Specifies the prefixes allowed from a neighbor.

spec.bgp.routers.neighbors.toReceive.allowed.mode

string

Specifies the mode to use when handling the prefixes. When set to filtered, only the prefixes in the prefixes list are allowed. When set to all, all the prefixes that the neighbor announces are allowed.

spec.bgp.routers.neighbors.disableMP

boolean

Disables MP BGP to prevent it from separating IPv4 and IPv6 route exchanges into distinct BGP sessions.

spec.bgp.routers.prefixes

string array

Specifies all prefixes to advertise from this router instance.

spec.bgp.bfdProfiles

array

Specifies the list of BFD profiles to use when configuring the neighbors.

spec.bgp.bfdProfiles.name

string

The name of the BFD Profile to be referenced in other parts of the configuration.

spec.bgp.bfdProfiles.receiveInterval

integer

Specifies the minimum interval at which this system can receive control packets, in milliseconds. Defaults to 300ms.

spec.bgp.bfdProfiles.transmitInterval

integer

Specifies the minimum transmission interval, excluding jitter, that this system wants to use to send BFD control packets, in milliseconds. Defaults to 300ms.

spec.bgp.bfdProfiles.detectMultiplier

integer

Configures the detection multiplier to determine packet loss. To determine the connection loss-detection timer, multiply the remote transmission interval by this value.

spec.bgp.bfdProfiles.echoInterval

integer

Configures the minimal echo receive transmission-interval that this system can handle, in milliseconds. Defaults to 50ms.

spec.bgp.bfdProfiles.echoMode

boolean

Enables or disables the echo transmission mode. This mode is disabled by default, and not supported on multihop setups.

spec.bgp.bfdProfiles.passiveMode

boolean

Marks the session as passive. A passive session does not attempt to start the connection and waits for control packets from peers before it begins replying.

spec.bgp.bfdProfiles.minimumTtl

integer

For multihop sessions only. Configures the minimum expected TTL for an incoming BFD control packet.

spec.nodeSelector

string

Limits the nodes that attempt to apply this configuration. If specified, only those nodes whose labels match the specified selectors attempt to apply the configuration. If it is not specified, all nodes attempt to apply this configuration.

status

string

Defines the observed state of FRRConfiguration.
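The detectMultiplier description in the table amounts to a single multiplication. The following Python sketch works through it. The function name is hypothetical, and the multiplier value of 3 is only an illustrative input, not a default documented here.

```python
def bfd_detection_time_ms(remote_transmit_interval_ms, detect_multiplier):
    """Loss-detection timer: remote transmission interval times the detection multiplier."""
    if remote_transmit_interval_ms <= 0 or detect_multiplier <= 0:
        raise ValueError("interval and multiplier must be positive")
    return remote_transmit_interval_ms * detect_multiplier


# A peer that transmits every 300 ms, checked with an illustrative multiplier of 3.
print(bfd_detection_time_ms(300, 3))  # 900
```

In this example, the session is declared down after roughly 900 ms without control packets. Lowering either value shortens failure detection at the cost of more sensitivity to transient packet loss.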

6.1.3. Understanding no-overlay mode for layer-3 networks using Border Gateway Protocol (BGP)

You can use no-overlay mode to route layer 3 pod traffic directly over the underlay network with BGP, which reduces encapsulation overhead and improves east-west performance.

No-overlay mode disables the default encapsulation for the default cluster network and uses BGP-learned routes to forward pod traffic across nodes. A cluster can run overlay and no-overlay networks at the same time.

For the default cluster network, no-overlay supports managed and unmanaged routing. With managed routing, OVN-Kubernetes creates a full-mesh BGP fabric between cluster nodes only, so no external BGP routers are required and pod routes are not advertised outside the cluster (intra-cluster traffic only). Managed routing requires nodes to be directly connected at layer 2; it is not suitable for clusters with nodes in different subnets. With unmanaged routing on the default network, you configure external BGP peers and use RouteAdvertisements custom resources (CRs) to advertise pod subnets to your existing BGP infrastructure.

For a primary network defined by a ClusterUserDefinedNetwork CR, no-overlay supports unmanaged routing only. Configure external BGP peers and RouteAdvertisements CRs for the CUDN.

Requirements
  • A bare-metal cluster that uses the OVN-Kubernetes network plugin.
  • Single-node zone interconnect mode enabled for the cluster.
  • BGP routing enabled and FRR-K8s deployed.
  • Layer 3 networks only (the default network or a primary network defined by a ClusterUserDefinedNetwork CR).
Limitations
  • No-overlay mode is not supported for layer 2 networks.
  • EgressIP, EgressService, IPsec, multicast, and multiple external gateways are not supported for no-overlay networks.
  • Switching an existing network between overlay and no-overlay modes is not supported using a ClusterUserDefinedNetwork CR.
Supported gateway modes
  • On the default cluster network, no-overlay is supported in both local gateway (LGW) mode and shared gateway (SGW) mode.
  • On a primary network defined by a ClusterUserDefinedNetwork CR, no-overlay is supported in both LGW and SGW modes.

    Important

    Pods running on a CUDN configured with NoOverlay transport mode cannot establish TCP connections to NodePort services when externalTrafficPolicy is set to Cluster and the backend pod resides on a different node than the one targeted by the request. This issue occurs regardless of whether outbound SNAT is enabled or disabled.

6.2. Enabling BGP routing

As a cluster administrator, you can enable OVN-Kubernetes Border Gateway Protocol (BGP) routing for your cluster to support dynamic route advertisement and integration with external network infrastructure.

6.2.1. Enabling Border Gateway Protocol (BGP) routing

As a cluster administrator, you can enable Border Gateway Protocol (BGP) routing for your cluster on bare-metal infrastructure by configuring the cluster network to use an FRR-based dynamic routing provider. BGP routing allows external network integration and route advertisement.

If you are using BGP routing in conjunction with the MetalLB Operator, the necessary BGP routing support is enabled automatically. You do not need to manually enable BGP routing support.

Prerequisites

  • You have installed the OpenShift CLI (oc).
  • You are logged in to the cluster as a user with the cluster-admin role.
  • The cluster is installed on compatible infrastructure.

Procedure

  • To enable a dynamic routing provider, enter the following command:

    $ oc patch Network.operator.openshift.io/cluster --type=merge -p '{
      "spec": {
        "additionalRoutingCapabilities": {
          "providers": ["FRR"]
        }
      }
    }'

6.3. Disabling BGP routing

As a cluster administrator, you can disable OVN-Kubernetes Border Gateway Protocol (BGP) routing for your cluster to stop external route advertisement and restore standard cluster networking behavior.

6.3.1. Disabling Border Gateway Protocol (BGP) routing

As a cluster administrator, you can disable Border Gateway Protocol (BGP) routing for your cluster on bare-metal infrastructure by removing the additional routing capabilities from the network configuration.

Prerequisites

  • You have installed the OpenShift CLI (oc).
  • You are logged in to the cluster as a user with the cluster-admin role.
  • The cluster is installed on compatible infrastructure.

Procedure

  • To disable dynamic routing, enter the following command:

    $ oc patch Network.operator.openshift.io/cluster --type=merge -p '{
      "spec": { "additionalRoutingCapabilities": null }
    }'
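The disable command works because --type=merge applies a JSON merge patch (RFC 7386), in which a null value removes the key from the target object. The following Python sketch implements those merge rules on plain dictionaries to show the effect on the Network spec. It is a simplified illustration of the patch semantics, not the API server code, and the starting spec is a made-up minimal example.

```python
def merge_patch(target, patch):
    """Apply a JSON merge patch (RFC 7386) to target and return the result."""
    if not isinstance(patch, dict):
        return patch  # scalars and lists replace the target value wholesale
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null (None) deletes the key
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


network = {"spec": {"defaultNetwork": {"type": "OVNKubernetes"}}}

enabled = merge_patch(network, {"spec": {"additionalRoutingCapabilities": {"providers": ["FRR"]}}})
disabled = merge_patch(enabled, {"spec": {"additionalRoutingCapabilities": None}})

print("additionalRoutingCapabilities" in enabled["spec"])   # True
print("additionalRoutingCapabilities" in disabled["spec"])  # False
```

Other fields under spec, such as defaultNetwork, are left unchanged by both patches because a merge patch only touches the keys that it names.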

6.4. Improve east-west performance by routing pods on the underlay with BGP

To improve east-west performance on bare-metal clusters, configure no-overlay mode with Border Gateway Protocol (BGP) so pod traffic uses underlay routing instead of Geneve encapsulation.

Important

No-overlay mode with BGP is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. Red Hat does not recommend using them in production. These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process.

For more information about the support scope of Red Hat Technology Preview features, see Technology Preview Features Support Scope.

6.4.1. Understanding no-overlay mode for layer-3 networks using Border Gateway Protocol (BGP)

You can use no-overlay mode to route layer 3 pod traffic directly over the underlay network with BGP, which reduces encapsulation overhead and improves east-west performance.

No-overlay mode disables the default encapsulation for the default cluster network and uses BGP-learned routes to forward pod traffic across nodes. A cluster can run overlay and no-overlay networks at the same time.

For the default cluster network, no-overlay supports managed and unmanaged routing. With managed routing, OVN-Kubernetes creates a full-mesh BGP fabric between cluster nodes only, so no external BGP routers are required and pod routes are not advertised outside the cluster (intra-cluster traffic only). Managed routing requires nodes to be directly connected at layer 2; it is not suitable for clusters with nodes in different subnets. With unmanaged routing on the default network, you configure external BGP peers and use RouteAdvertisements custom resources (CRs) to advertise pod subnets to your existing BGP infrastructure.

For a primary network defined by a ClusterUserDefinedNetwork CR, no-overlay supports unmanaged routing only. Configure external BGP peers and RouteAdvertisements CRs for the CUDN.

Requirements
  • A bare-metal cluster that uses the OVN-Kubernetes network plugin.
  • Single-node zone interconnect mode enabled for the cluster.
  • BGP routing enabled and FRR-K8s deployed.
  • Layer 3 networks only (the default network or a primary network defined by a ClusterUserDefinedNetwork CR).
Limitations
  • No-overlay mode is not supported for layer 2 networks.
  • EgressIP, EgressService, IPsec, multicast, and multiple external gateways are not supported for no-overlay networks.
  • Switching an existing network between overlay and no-overlay modes is not supported using a ClusterUserDefinedNetwork CR.
Supported gateway modes
  • On the default cluster network, no-overlay is supported in both local gateway (LGW) mode and shared gateway (SGW) mode.
  • On a primary network defined by a ClusterUserDefinedNetwork CR, no-overlay is supported in both LGW and SGW modes.

    Important

    Pods running on a CUDN configured with NoOverlay transport mode cannot establish TCP connections to NodePort services when externalTrafficPolicy is set to Cluster and the backend pod resides on a different node than the one targeted by the request. This issue occurs regardless of whether outbound SNAT is enabled or disabled.

6.4.2. Plan underlay routing and BGP for your cluster

Plan routing mode, BGP topology, and SNAT behavior so pod subnets are routable and the design scales for your cluster size.

Workflow and timing

No-overlay mode requires BGP to be enabled on the cluster by configuring the Network custom resource (CR) so that the Cluster Network Operator (CNO) applies your settings.

While no-overlay is a Technology Preview feature, enable the TechPreviewNoUpgrade feature set at installation time if you plan to use the feature on the default cluster network from day 0.

Day 0 (installation): If the default cluster network will use no-overlay, supply manifests in the installation manifests/ directory so BGP and no-overlay settings are applied when the cluster is created.

Day 2 (running cluster): You can patch network.operator/cluster to adjust the outboundSNAT field for clusters using unmanaged routing. Primary ClusterUserDefinedNetwork (CUDN) resources that use no-overlay can be created or updated after installation.

Routing mode
On the default cluster network, managed routing creates a full-mesh BGP fabric between nodes and is easier to operate. It requires nodes to be directly connected at layer 2 and is not suitable for nodes in different subnets. Unmanaged routing lets you control external peers and routing policies and is typically used when integrating with external BGP or when nodes are not in the same layer 2 segment.
BGP topology
For managed routing on the default cluster network, the supported topology is full mesh between nodes so that every node peers with every other node. Unmanaged routing uses your own design, for example eBGP to external routers, together with FRRConfiguration and RouteAdvertisements CRs.
Note

Managed routing and a full-mesh BGP topology apply to the default cluster network when you choose that mode. A primary ClusterUserDefinedNetwork with no-overlay requires unmanaged routing in which you plan external BGP peers, create FRRConfiguration and RouteAdvertisements CRs for that network, and set spec.network.noOverlayOptions.outboundSNAT to Disabled.
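Because a full mesh peers every node with every other node, the number of BGP sessions grows quadratically with the node count, which matters when you judge whether managed routing scales for your cluster size. The following Python sketch works through the arithmetic. The function names are hypothetical, and the sketch counts one session per node pair.

```python
def full_mesh_sessions(node_count):
    """Total BGP sessions in a full mesh: one per unordered pair of nodes."""
    if node_count < 0:
        raise ValueError("node_count must not be negative")
    return node_count * (node_count - 1) // 2


def peers_per_node(node_count):
    """Number of peers that each node maintains in a full mesh."""
    return max(node_count - 1, 0)


for nodes in (3, 10, 100):
    print(nodes, peers_per_node(nodes), full_mesh_sessions(nodes))
# 3 2 3
# 10 9 45
# 100 99 4950
```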

SNAT behavior
Enable outbound SNAT when pod IPs are not routable on the external network. Disable outbound SNAT when the underlay can route pod IPs directly.
IP address planning
Allocate pod subnets and node subnets so they do not overlap with existing networks and can be advertised to your BGP fabric.
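An overlap check of the planned ranges can be done before installation. The following Python sketch uses the standard ipaddress module to report overlapping pairs. The function name overlapping_networks is hypothetical, and the CIDR values and range names are illustrative placeholders, so substitute the ranges from your own plan.

```python
import ipaddress
from itertools import combinations


def overlapping_networks(named_cidrs):
    """Return pairs of names whose CIDR ranges overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in named_cidrs.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
        # Only compare ranges of the same IP version.
        if net_a.version == net_b.version and net_a.overlaps(net_b)
    ]


plan = {
    "pods": "10.128.0.0/14",       # illustrative pod network
    "nodes": "192.168.111.0/24",   # illustrative node subnet
    "existing": "10.130.0.0/16",   # illustrative range already used elsewhere
}
print(overlapping_networks(plan))  # [('pods', 'existing')]
```

In this example, the existing 10.130.0.0/16 range falls inside the pod network, so advertising the pod subnets to the BGP fabric would conflict with routes that are already in use.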

6.4.3. Enable underlay routing for the default cluster network

To steer the default network's east-west traffic over the underlay network instead of Geneve, configure the Cluster Network Operator (CNO) for NoOverlay transport and set the outbound source network address translation (SNAT) and routing options. If you use unmanaged routing, apply FRRConfiguration and RouteAdvertisements custom resources (CRs) so that your routers exchange pod routes.

Prerequisites

  • You have cluster-admin privileges.
  • Your cluster is installed on bare-metal infrastructure with single-node zone interconnect mode.
  • You have enabled Border Gateway Protocol (BGP) routing for the cluster. See "About BGP routing" and "Enabling BGP routing" for more information.
  • You deployed FRR-K8s on the cluster nodes as part of enabling BGP routing.

Important
  • Choose the spec.defaultNetwork.ovnKubernetesConfig.noOverlayConfig.outboundSNAT value based on whether pod IP addresses are routable on your external network. Set the value to Enabled when they are not routable, and to Disabled when the underlay can route pod IP addresses directly.
  • For unmanaged mode, you must set outboundSNAT to Enabled; otherwise, cluster deployment fails.

Procedure

  1. Enable no-overlay for the default network in the Cluster Network Operator (CNO) custom resource (CR).

    1. At installation time (day 0), add a manifest to your installation manifests/ directory that configures the noOverlayConfig object, as in the following unmanaged and managed routing examples.

      Example CNO CR using no-overlay with unmanaged routing

      apiVersion: operator.openshift.io/v1
      kind: Network
      metadata:
        name: cluster
      spec:
        additionalRoutingCapabilities:
          providers:
          - FRR
        defaultNetwork:
          ovnKubernetesConfig:
            routeAdvertisements: Enabled
            transport: NoOverlay
            noOverlayConfig:
              outboundSNAT: Enabled
              routing: Unmanaged
          type: OVNKubernetes

      Example CNO CR using no-overlay with managed routing

      apiVersion: operator.openshift.io/v1
      kind: Network
      metadata:
        name: cluster
      spec:
        additionalRoutingCapabilities:
          providers:
          - FRR
        defaultNetwork:
          ovnKubernetesConfig:
            routeAdvertisements: Enabled
            transport: NoOverlay
            noOverlayConfig:
              outboundSNAT: Enabled
              routing: Managed
              bgpTopology: FullMesh
              asNumber: 64512
          type: OVNKubernetes

      spec.defaultNetwork.ovnKubernetesConfig.noOverlayConfig.bgpTopology
      Specifies FullMesh for a full-mesh BGP fabric between nodes.
      spec.defaultNetwork.ovnKubernetesConfig.noOverlayConfig.asNumber
      Optional: specifies the BGP autonomous system number used in the default VRF. When omitted, 64512 is used.
    2. On a running cluster (day 2), configure the noOverlayConfig object in CNO using the following command:

      $ oc patch network.operator.openshift.io cluster --type merge --patch '{"spec":{"defaultNetwork":{"ovnKubernetesConfig":{"noOverlayConfig":{"outboundSNAT":"Enabled"}}}}}'
    3. For managed routing, proceed to the verification step. Do not create RouteAdvertisements or FRRConfiguration objects, because OVN-Kubernetes reconciles the managed BGP fabric.
  2. If you use unmanaged routing, add manifests to the installation manifests/ directory (day 0) for the following custom resources (CRs):

    1. Add the following FRRConfiguration CR:

      apiVersion: frrk8s.metallb.io/v1beta1
      kind: FRRConfiguration
      metadata:
        name: external-bgp
        namespace: openshift-frr-k8s
        labels:
          network: default
      spec:
        bgp:
          routers:
          - asn: 64512
            neighbors:
            - address: 192.168.111.1
              asn: 64512
              disableMP: true
              toReceive:
                allowed:
                  mode: filtered

      Replace spec.bgp.routers[].neighbors[].address, ASN values, and toReceive filters so they match your external BGP design.

    2. Add the following RouteAdvertisements CR:

      apiVersion: k8s.ovn.org/v1
      kind: RouteAdvertisements
      metadata:
        name: default
      spec:
        advertisements:
        - PodNetwork
        frrConfigurationSelector:
          matchLabels:
            network: default
        networkSelectors:
        - networkSelectionType: DefaultNetwork
        nodeSelector: {}
  3. Alternatively, on a running cluster (day 2), create and apply the following CRs with oc apply:

    1. Create an FRRConfiguration CR that defines BGP peering toward your external router.

      Example FRRConfiguration CR for unmanaged routing

      apiVersion: frrk8s.metallb.io/v1beta1
      kind: FRRConfiguration
      metadata:
        name: external-bgp
        namespace: openshift-frr-k8s
        labels:
          network: default
      spec:
        bgp:
          routers:
          - asn: 64512
            neighbors:
            - address: 192.168.111.1
              asn: 64512
              disableMP: true
              toReceive:
                allowed:
                  mode: filtered

      Replace spec.bgp.routers[].neighbors[].address, ASN values, and toReceive filters so they match your external BGP design.

    2. Apply the FRRConfiguration CR using the following command:

      $ oc apply -f <frrconfiguration_file>.yaml

      Replace <frrconfiguration_file>.yaml with your manifest file name.

    3. Create a RouteAdvertisements CR that advertises the pod network.

      Example RouteAdvertisements CR for unmanaged routing

      apiVersion: k8s.ovn.org/v1
      kind: RouteAdvertisements
      metadata:
        name: default
      spec:
        advertisements:
        - PodNetwork
        frrConfigurationSelector:
          matchLabels:
            network: default
        networkSelectors:
        - networkSelectionType: DefaultNetwork
        nodeSelector: {}

      Note

      For unmanaged routing on the default network, at least one RouteAdvertisements object must select the default network, include PodNetwork in spec.advertisements, and report Accepted=True in its status. In the example, the spec.networkSelectors entry with networkSelectionType: DefaultNetwork selects the default network. OVN-Kubernetes uses this configuration when advertising pod subnets to your BGP infrastructure.

    4. Apply the RouteAdvertisements CR using the following command:

      $ oc apply -f <routeadvertisements_file>.yaml

      Replace <routeadvertisements_file>.yaml with your file name.

Verification

  1. Verify that the OVN-Kubernetes pods are running:

    $ oc get pods -n openshift-ovn-kubernetes
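
The check above can be scripted. The following is a minimal sketch, not part of the product: pods_not_running is a hypothetical helper name, and the sketch assumes the default column layout of oc get pods --no-headers (NAME, READY, STATUS, ...). It prints every pod that is not in the Running state or that has containers that are not ready.

```shell
#!/bin/bash
# pods_not_running (hypothetical helper): reads `oc get pods --no-headers`
# output on stdin and prints the name of every pod whose STATUS is not
# Running or whose READY count (for example 7/8) is incomplete.
pods_not_running() {
  awk '{
    split($2, ready, "/")            # READY column, for example "8/8"
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}

# Example (requires a logged-in oc session):
#   oc get pods -n openshift-ovn-kubernetes --no-headers | pods_not_running
```

An empty result means that every listed pod is running with all containers ready.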

6.4.4. Create a ClusterUserDefinedNetwork CR that uses underlay routing

Create a layer 3 ClusterUserDefinedNetwork (CUDN) custom resource (CR) with no-overlay transport and unmanaged routing so pods use BGP routes instead of encapsulation for east-west traffic.

You advertise pod subnets using FRRConfiguration and RouteAdvertisements CRs. For managed routing and a full-mesh BGP fabric between nodes on the default cluster network, see "Enable underlay routing for the default cluster network".

Important
  • On a primary CUDN, NoOverlay mode supports unmanaged routing only. Managed routing (full-mesh BGP between nodes without external peers) is supported on the default cluster network only.
  • On a primary CUDN, combining NoOverlay transport with outboundSNAT set to Enabled is not supported.

Prerequisites

  • You have cluster-admin privileges.
  • Your cluster is installed on bare-metal infrastructure with single-node zone interconnect mode.
  • You have enabled the NoOverlayMode feature flag in the TechPreviewNoUpgrade feature set.
  • You enabled BGP routing support for the cluster.
  • You deployed FRR-K8s on cluster nodes.

Procedure

  1. Create a ClusterUserDefinedNetwork CR that uses no-overlay transport.

    Note

    For a primary layer 3 ClusterUserDefinedNetwork CR, every namespace that matches spec.namespaceSelector must include the k8s.ovn.org/primary-user-defined-network label before workloads can use the network; that label can only be set when the namespace is created.

    1. Set spec.network.noOverlayOptions.routing to Unmanaged.

      Example ClusterUserDefinedNetwork CR for no-overlay mode with outboundSNAT set to Disabled

      apiVersion: k8s.ovn.org/v1
      kind: ClusterUserDefinedNetwork
      metadata:
        name: high-perf-network
        labels:
          network: high-perf-network
      spec:
        namespaceSelector:
          matchLabels:
            app: performance-sensitive
        network:
          topology: Layer3
          layer3:
            role: Primary
            subnets:
            - cidr: 10.200.0.0/16
              hostSubnet: 24
          transport: "NoOverlay"
          noOverlayOptions:
            outboundSNAT: "Disabled"
            routing: "Unmanaged"

      Important

      Pods on a CUDN that uses NoOverlay mode cannot establish TCP connections to NodePort services when externalTrafficPolicy is set to Cluster and the backend pod resides on a different node than the one targeted by the request. This issue occurs regardless of whether outbound SNAT is enabled or disabled.

    2. Apply the ClusterUserDefinedNetwork CR by entering the following command:

      $ oc apply -f <cudn_file>.yaml

      Replace <cudn_file>.yaml with the name of your ClusterUserDefinedNetwork CR file.

  2. Create a RouteAdvertisements CR:

    1. Set spec.advertisements to PodNetwork to advertise the CUDN pod subnets to your external BGP infrastructure.

      Example RouteAdvertisements CR advertising the CUDN pod subnets

      apiVersion: k8s.ovn.org/v1
      kind: RouteAdvertisements
      metadata:
        name: high-perf-network
      spec:
        nodeSelector: {}
        frrConfigurationSelector:
          matchLabels:
            network: high-perf-network
        networkSelectors:
        - networkSelectionType: ClusterUserDefinedNetworks
          clusterUserDefinedNetworkSelector:
            networkSelector:
              matchLabels:
                network: high-perf-network
        advertisements:
        - PodNetwork

      where:

      spec.nodeSelector
      Specifies which nodes to include in the advertisements; when empty ({}), all nodes are selected.
      spec.frrConfigurationSelector
      Specifies the FRRConfiguration that peers with your external routers. Use matchLabels to select the FRRConfiguration by its labels.
      spec.networkSelectors.networkSelectionType
      Specifies the type of network to advertise. Set to ClusterUserDefinedNetworks to advertise a cluster user-defined network (CUDN). Set to DefaultNetwork to advertise the default cluster network.
      spec.advertisements
      Specifies the types of routes to advertise. Set to PodNetwork to advertise pod subnets. Set to EgressIP to advertise egress IP addresses.
    2. Apply the RouteAdvertisements CR by entering the following command:

      $ oc apply -f <routeadvertisements_file>.yaml

      Replace <routeadvertisements_file>.yaml with the name of your RouteAdvertisements CR file.

  3. Verify that the no-overlay transport was accepted by entering the following command:

    $ oc get clusteruserdefinednetwork high-perf-network -o yaml
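
Rather than reading the full YAML, you can reduce the status to its conditions. The following is a minimal sketch under stated assumptions: conditions_not_true is a hypothetical helper name, and the sketch assumes that the CR reports its state in status.conditions with type and status fields. This section does not list the condition names, so the helper prints every condition that is not True for you to inspect.

```shell
#!/bin/bash
# conditions_not_true (hypothetical helper): reads lines of the form
# "<type>=<status>" on stdin, prints each condition whose status is not
# "True", and returns 1 if it printed anything.
conditions_not_true() {
  awk -F= '$2 != "True" { print $1 "=" $2; bad = 1 }
           END { exit (bad ? 1 : 0) }'
}

# Example (requires a logged-in oc session):
#   oc get clusteruserdefinednetwork high-perf-network \
#     -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}' \
#     | conditions_not_true
```

A condition that is not True is not necessarily an error; treat the output as a list of items to review.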

6.4.5. Reference for Cluster Network Operator (CNO) settings for default network underlay routing

Review the Cluster Network Operator (CNO) custom resource (CR) fields that control BGP, route advertisements, and no-overlay mode for the default OVN-Kubernetes cluster network. No-overlay mode for the default network is configured on the CNO CR.

Example CNO CR with BGP, no-overlay, and managed routing

apiVersion: operator.openshift.io/v1
kind: Network
metadata:
  name: cluster
spec:
  additionalRoutingCapabilities:
    providers:
    - FRR
  defaultNetwork:
    ovnKubernetesConfig:
      routeAdvertisements: Enabled
      transport: NoOverlay
      noOverlayConfig:
        outboundSNAT: Enabled
        routing: Managed
        bgpTopology: FullMesh
        asNumber: 64512
    type: OVNKubernetes

where:

spec.defaultNetwork.type
Must be OVNKubernetes.
spec.additionalRoutingCapabilities.providers
Specifies additional routing components on the cluster. For BGP and route advertisements, include FRR so FRR-K8s is deployed on nodes.
spec.defaultNetwork.ovnKubernetesConfig.routeAdvertisements
Specifies whether the cluster can import and advertise routes as configured by RouteAdvertisements objects and related BGP configuration. Set to Enabled to allow this behavior.
spec.defaultNetwork.ovnKubernetesConfig.transport
Specifies the transport protocol. Set to NoOverlay to disable the default encapsulation for the default network and use underlay for routing pod traffic.
spec.defaultNetwork.ovnKubernetesConfig.noOverlayConfig.outboundSNAT
Specifies outbound SNAT for the default network. Set to Enabled or Disabled. When Enabled, outbound pod traffic to networks outside the pod subnet is source NATed to the node IP when required. When Disabled, the underlay must route pod IPs directly. Pod-to-pod traffic within the cluster is not SNATed; traffic to node IPs, the Kubernetes API, and cluster DNS is still SNATed where applicable.
spec.defaultNetwork.ovnKubernetesConfig.noOverlayConfig.routing
Specifies the routing mode for the default network. Set to Managed for a full-mesh BGP fabric between nodes (intra-cluster), or Unmanaged when you provide FRRConfiguration and RouteAdvertisements CRs for external BGP.
spec.defaultNetwork.ovnKubernetesConfig.noOverlayConfig.bgpTopology
Specifies the BGP topology for the default network. When routing is set to Managed, set to FullMesh so every node runs a BGP router and peers with all other nodes.
spec.defaultNetwork.ovnKubernetesConfig.noOverlayConfig.asNumber
Optional: Specifies the BGP autonomous system (AS) number for the default VRF when routing is set to Managed. When omitted, the default is 64512.
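
The fields described above can be compared against the example values with a short script. The following is a minimal sketch: check_no_overlay_settings is a hypothetical helper name, and the expected values (routeAdvertisements set to Enabled, transport set to NoOverlay, routing set to Managed or Unmanaged) are taken from the examples in this section rather than from a complete validation rule set.

```shell
#!/bin/bash
# check_no_overlay_settings ROUTE_ADVERTISEMENTS TRANSPORT ROUTING
# Hypothetical helper: compares three values read from the CNO CR with the
# values used in the examples in this section, prints every mismatch, and
# returns 1 if any value differs.
check_no_overlay_settings() {
  local route_adv=$1 transport=$2 routing=$3 rc=0
  if [ "$route_adv" != "Enabled" ]; then
    echo "routeAdvertisements is '${route_adv}', expected Enabled"; rc=1
  fi
  if [ "$transport" != "NoOverlay" ]; then
    echo "transport is '${transport}', expected NoOverlay"; rc=1
  fi
  case "$routing" in
    Managed|Unmanaged) ;;
    *) echo "routing is '${routing}', expected Managed or Unmanaged"; rc=1 ;;
  esac
  return "$rc"
}

# Example (requires a logged-in oc session):
#   cfg='.spec.defaultNetwork.ovnKubernetesConfig'
#   check_no_overlay_settings \
#     "$(oc get network.operator cluster -o jsonpath="{${cfg}.routeAdvertisements}")" \
#     "$(oc get network.operator cluster -o jsonpath="{${cfg}.transport}")" \
#     "$(oc get network.operator cluster -o jsonpath="{${cfg}.noOverlayConfig.routing}")"
```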

6.4.6. Reference for underlay routing settings on a ClusterUserDefinedNetwork custom resource (CR)

Review the full spec paths on a ClusterUserDefinedNetwork custom resource (CR) when you use no-overlay transport on a primary layer 3 cluster user-defined network.

Example ClusterUserDefinedNetwork CR for no-overlay mode (unmanaged routing)

apiVersion: k8s.ovn.org/v1
kind: ClusterUserDefinedNetwork
metadata:
  name: high-perf-network
  labels:
    network: high-perf-network
spec:
  namespaceSelector:
    matchLabels:
      app: performance-sensitive
  network:
    topology: Layer3
    layer3:
      role: Primary
      subnets:
      - cidr: 10.200.0.0/16
        hostSubnet: 24
    transport: "NoOverlay"
    noOverlayOptions:
      outboundSNAT: "Disabled"
      routing: "Unmanaged"

where:

spec.namespaceSelector
Specifies label selectors for namespaces that can attach workloads to the primary network.
spec.network.topology
Specifies the network topology. Must be Layer3.
spec.network.layer3.role
Specifies the role of the network. For a primary CUDN, set to Primary.
spec.network.layer3.subnets
Specifies a list of objects with cidr and hostSubnet fields that define the pod address space and per-node prefix size.
spec.network.transport
Specifies the transport protocol. Set to NoOverlay to disable the default encapsulation for this CUDN and use underlay routing for pod traffic.
spec.network.noOverlayOptions
Specifies routing mode and SNAT behavior for this network. Required when transport is set to NoOverlay.
spec.network.noOverlayOptions.routing
Specifies the routing mode for the network. For a CUDN, Unmanaged is the only supported value: you manage the external BGP peers and the RouteAdvertisements CR for this network. No-overlay managed routing mode is supported on the default cluster network only, and you must configure it on the Cluster Network Operator (CNO) CR, not on a CUDN.
spec.network.noOverlayOptions.outboundSNAT
Specifies outbound SNAT for this network. For a primary ClusterUserDefinedNetwork CR with NoOverlay transport, set to Disabled when the underlay routes pod IPs directly. A value of Enabled is not supported on this CR type. To use Enabled or Disabled with no-overlay on the default cluster network, configure spec.defaultNetwork.ovnKubernetesConfig.noOverlayConfig.outboundSNAT on the CNO CR instead.
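
To make the relationship between cidr and hostSubnet concrete, the following sketch works through the arithmetic for the example values (10.200.0.0/16 with hostSubnet set to 24). The function names are hypothetical, and the sketch assumes IPv4 and that each node receives one per-node prefix. It counts raw addresses; the number of IP addresses usable by pods is typically smaller, because some addresses in each node subnet are reserved.

```shell
#!/bin/bash
# node_subnet_count CIDR_PREFIX HOST_SUBNET
# Number of per-node subnets that fit in the network CIDR.
node_subnet_count() {
  echo $(( 1 << ($2 - $1) ))
}

# addresses_per_node_subnet HOST_SUBNET
# Number of IPv4 addresses in one per-node subnet.
addresses_per_node_subnet() {
  echo $(( 1 << (32 - $1) ))
}

node_subnet_count 16 24          # prints 256
addresses_per_node_subnet 24     # prints 256
```

With these values, the network can assign a /24 prefix to at most 256 nodes.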

6.4.7. Troubleshoot connectivity for pods that use underlay routing

To resolve no-overlay connectivity issues, verify BGP sessions, route advertisements, and network status.

Prerequisites

  • You have cluster-admin privileges.
  • You configured no-overlay mode for the default network or a ClusterUserDefinedNetwork (CUDN) CR.

Procedure

  1. Verify that FRR-K8s pods are running by running the following command:

    $ oc get pods -n openshift-frr-k8s
  2. If you configured no-overlay for the default network, verify the Cluster Network Operator (CNO) CR by running the following command:

    $ oc get network.operator cluster -o yaml

    Confirm that spec.defaultNetwork.ovnKubernetesConfig includes the expected routeAdvertisements, transport, and noOverlayConfig values.

  3. If you use unmanaged routing, verify RouteAdvertisements objects by running the following command:

    $ oc describe routeadvertisements <routeadvertisements_name>

    Replace <routeadvertisements_name> with the name of your RouteAdvertisements object.

    The output should show Accepted=True in the status section.

    Note

    If you use managed routing, you typically do not create RouteAdvertisements yourself for intra-cluster-only designs; if pod connectivity fails, continue with the remaining steps in this procedure.

  4. If you configured a no-overlay ClusterUserDefinedNetwork CR, check its status by running the following command:

    $ oc get clusteruserdefinednetwork <cudn_name> -o yaml

    Replace <cudn_name> with the name of your CUDN CR.

  5. Check for BGP-related errors in the OVN-Kubernetes logs by entering the following command:

    $ oc logs -n openshift-ovn-kubernetes -l app=ovnkube-node
  6. Confirm that pod subnets are present in the node routing table by entering the following command:

    $ oc debug node/<node_name> -- chroot /host ip route
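
The routing table of a node can be long, so it can help to list only the BGP-learned destinations. The following is a minimal sketch: bgp_routes is a hypothetical helper name, and the sketch assumes that routes installed by FRR appear in the ip route output with the proto bgp attribute. If your routes carry a different protocol label, adjust the match.

```shell
#!/bin/bash
# bgp_routes (hypothetical helper): reads `ip route` output on stdin and
# prints the destination prefix of every route marked "proto bgp".
bgp_routes() {
  awk '{
    for (i = 2; i < NF; i++) {
      if ($i == "proto" && $(i + 1) == "bgp") { print $1; break }
    }
  }'
}

# Example (requires a logged-in oc session):
#   oc debug node/<node_name> -- chroot /host ip route | bgp_routes
```

Compare the printed prefixes with the pod subnets that you expect other nodes or external peers to advertise.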

6.5. Migrating FRR-K8s resources

Migrating FRR-K8s custom resources is required when upgrading from OpenShift Container Platform 4.17 or earlier with the MetalLB Operator deployed. Existing FRRConfiguration resources in the metallb-system namespace must be moved to the openshift-frr-k8s namespace to align with the updated architecture. Learn how to migrate these resources using the CLI and how to verify that the migration completed successfully.

All user-created FRR-K8s custom resources (CRs) in the metallb-system namespace under OpenShift Container Platform 4.17 and earlier releases must be migrated to the openshift-frr-k8s namespace. As a cluster administrator, you can migrate your FRR-K8s custom resources to the openshift-frr-k8s namespace using the CLI.

6.5.1. Migrating FRR-K8s resources

You can migrate the FRR-K8s FRRConfiguration custom resources from the metallb-system namespace to the openshift-frr-k8s namespace.

When upgrading from an earlier version of OpenShift Container Platform with the MetalLB Operator deployed, you must manually migrate your custom FRRConfiguration configurations from the metallb-system namespace to the openshift-frr-k8s namespace.

Prerequisites

  • You have installed the OpenShift CLI (oc).
  • You are logged in to the cluster as a user with the cluster-admin role.

Procedure

  1. To create the openshift-frr-k8s namespace, enter the following command:

    $ oc create namespace openshift-frr-k8s
  2. To automate the migration, create a shell script named migrate.sh with the following contents:

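    #!/bin/bash
    # Copy user-created FRRConfiguration CRs to the new namespace.
    set -euo pipefail
    OLD_NAMESPACE="metallb-system"
    NEW_NAMESPACE="openshift-frr-k8s"
    # Skip CRs whose names contain this string; only user-created CRs are migrated.
    FILTER_OUT="metallb-"
    # Server-populated fields are removed so that the objects can be created as new.
    oc get frrconfigurations.frrk8s.metallb.io -n "${OLD_NAMESPACE}" -o json |\
      jq -r '.items[] | select(.metadata.name | test("'"${FILTER_OUT}"'") | not)' |\
      jq -r '.metadata.namespace = "'"${NEW_NAMESPACE}"'"' |\
      jq -r 'del(.metadata.resourceVersion, .metadata.uid, .metadata.creationTimestamp, .status)' |\
      oc create -f -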
  3. To execute the migration, run the following command:

    $ bash migrate.sh

Verification

  • To confirm that the migration succeeded, run the following command:

    $ oc get frrconfigurations.frrk8s.metallb.io -n openshift-frr-k8s

After the migration is complete, you can remove the FRRConfiguration custom resources from the metallb-system namespace.
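
Before you remove the original resources, you might want to confirm that every user-created CR exists in the new namespace. The following is a minimal sketch: missing_after_migration is a hypothetical helper name, and the sketch assumes two files that each contain one CR name per line, one file per namespace. It applies the same metallb- name filter as migrate.sh.

```shell
#!/bin/bash
# missing_after_migration OLD_LIST NEW_LIST
# Hypothetical helper: prints each name in OLD_LIST that does not contain
# "metallb-" and that is absent from NEW_LIST. Both files hold one CR name
# per line.
missing_after_migration() {
  grep -v 'metallb-' "$1" | grep -vxF -f "$2" || true
}

# Example (requires a logged-in oc session):
#   names='{range .items[*]}{.metadata.name}{"\n"}{end}'
#   oc get frrconfigurations.frrk8s.metallb.io -n metallb-system \
#     -o jsonpath="${names}" > old.txt
#   oc get frrconfigurations.frrk8s.metallb.io -n openshift-frr-k8s \
#     -o jsonpath="${names}" > new.txt
#   missing_after_migration old.txt new.txt
```

An empty result means that no user-created CR is missing from the openshift-frr-k8s namespace.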
