Chapter 5. Configuring an RDMA subsystem for SR-IOV
Remote Direct Memory Access (RDMA) allows direct memory access between two systems without involving the operating system of either system. You can configure an RDMA Container Network Interface (CNI) on Single Root I/O Virtualization (SR-IOV) to enable high-performance, low-latency communication between containers. When you combine RDMA with SR-IOV, you provide a mechanism to expose hardware counters of Mellanox Ethernet devices for use inside Data Plane Development Kit (DPDK) applications.
5.1. Configuring SR-IOV RDMA CNI
Configure an RDMA CNI on SR-IOV.
This procedure applies only to Mellanox devices.
Prerequisites
- You have installed the OpenShift CLI (oc).
- You have access to the cluster as a user with the cluster-admin role.
- You have installed the SR-IOV Network Operator.
Procedure
Create an SriovNetworkPoolConfig CR and save it as sriov-nw-pool.yaml. In the CR, set the RDMA network namespace mode to exclusive.
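The example for this step did not survive in this copy of the page, so the following is only a minimal sketch of what such a CR might look like. The rdmaMode value comes from the step above. The metadata name, the Operator namespace, the maxUnavailable value, and the node selector are assumptions and must be adapted to your cluster.

```yaml
# Sketch only; not the original example from this document.
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkPoolConfig
metadata:
  name: worker                                  # assumed name
  namespace: openshift-sriov-network-operator   # assumed Operator namespace
spec:
  maxUnavailable: 1                             # assumed value
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/worker: ""        # assumed node label
  rdmaMode: exclusive                           # RDMA network namespace mode, as required by this step
```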
Create the SriovNetworkPoolConfig resource by running the following command:
$ oc create -f sriov-nw-pool.yaml
Create an SriovNetworkNodePolicy CR and save it as sriov-node-policy.yaml. In the CR, activate RDMA mode.
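A node policy for this step might look roughly like the following sketch. The isRdma field activates RDMA mode. The resource name sriov_nic_pf1 is inferred from the PCIDEVICE_OPENSHIFT_IO_SRIOV_NIC_PF1 variable used in the verification step. The policy name, the physical function name, the number of VFs, the priority, and the node selector are hypothetical placeholders.

```yaml
# Sketch only; not the original example from this document.
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: sriov-nic-pf1                           # assumed name
  namespace: openshift-sriov-network-operator   # assumed Operator namespace
spec:
  deviceType: netdevice
  isRdma: true                                  # activates RDMA mode, as required by this step
  nicSelector:
    pfNames: ["ens3f0np0"]                      # hypothetical PF name; use the name of your Mellanox interface
  nodeSelector:
    node-role.kubernetes.io/worker: ""          # assumed node label
  numVfs: 4                                     # assumed value
  priority: 99                                  # assumed value
  resourceName: sriov_nic_pf1                   # inferred from the environment variable in the verification step
```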
Create the SriovNetworkNodePolicy resource by running the following command:
$ oc create -f sriov-node-policy.yaml
Create an SriovNetwork CR and save it as sriov-network.yaml. In the CR, create the RDMA plugin.
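As a rough illustration of this step, the sketch below chains the RDMA CNI as a meta plugin of the SR-IOV network. The sriov-tests namespace is taken from the verification commands. The network name and the resource name are assumptions, and IPAM configuration is deliberately left out; add it if your workload needs IP addresses on this interface.

```yaml
# Sketch only; not the original example from this document.
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: sriov-nic-pf1                           # assumed name
  namespace: openshift-sriov-network-operator   # assumed Operator namespace
spec:
  networkNamespace: sriov-tests                 # namespace used by the verification commands
  resourceName: sriov_nic_pf1                   # assumed; must match the resourceName of your node policy
  metaPlugins: |
    {
      "type": "rdma"
    }
```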
Create the SriovNetwork resource by running the following command:
$ oc create -f sriov-network.yaml
Verification
Create a Pod CR and save it as sriov-test-pod.yaml.
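The runtime configuration example is missing from this copy of the page. A test pod along the following lines should be enough for the check. The pod name testpod1 and the sriov-tests namespace come from the commands in this section. The network name in the annotation is an assumption and must match the name of your SriovNetwork CR. The image is a placeholder for any image that provides a shell and ls. The sketch also assumes that the network resources injector is enabled, so that the VF resource request is added to the pod automatically.

```yaml
# Sketch only; not the original example from this document.
apiVersion: v1
kind: Pod
metadata:
  name: testpod1                                # name used by the oc rsh command
  namespace: sriov-tests                        # namespace used by the oc rsh command
  annotations:
    k8s.v1.cni.cncf.io/networks: sriov-nic-pf1  # assumed; must match the name of your SriovNetwork CR
spec:
  containers:
  - name: testpod1
    image: <image>                              # placeholder; any image with a shell and ls
    command: ["sleep", "infinity"]              # keeps the pod running for inspection
```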
Create the test pod by running the following command:
$ oc create -f sriov-test-pod.yaml
Log in to the test pod by running the following command:
$ oc rsh testpod1 -n sriov-tests
Verify that the path to the hw_counters directory exists by running the following command:
$ ls /sys/bus/pci/devices/${PCIDEVICE_OPENSHIFT_IO_SRIOV_NIC_PF1}/infiniband/*/ports/1/hw_counters/
Example output
duplicate_request      out_of_buffer    req_cqe_flush_error         resp_cqe_flush_error       roce_adp_retrans        roce_slow_restart_trans
implied_nak_seq_err    out_of_sequence  req_remote_access_errors    resp_local_length_error    roce_adp_retrans_to     rx_atomic_requests
lifespan               packet_seq_err   req_remote_invalid_request  resp_remote_access_errors  roce_slow_restart       rx_read_requests
local_ack_timeout_err  req_cqe_error    resp_cqe_error              rnr_nak_retry_err          roce_slow_restart_cnps  rx_write_requests