13.2. Transferring Data Using RoCE
RDMA over Converged Ethernet (RoCE) is a network protocol that enables remote direct memory access (RDMA) over an Ethernet network. There are two versions, RoCE v1 and RoCE v2; which one is used depends on the network adapter.
- RoCE v1
- The RoCE v1 protocol is an Ethernet link layer protocol with ethertype 0x8915 that enables communication between any two hosts in the same Ethernet broadcast domain. RoCE v1 is the default version for RDMA Connection Manager (RDMA_CM) when using the ConnectX-3 network adapter.
- RoCE v2
- The RoCE v2 protocol exists on top of either the UDP over IPv4 or the UDP over IPv6 protocol. The UDP destination port number 4791 has been reserved for RoCE v2. Since Red Hat Enterprise Linux 7.5, RoCE v2 is the default version for RDMA_CM when using the ConnectX-3 Pro, ConnectX-4, ConnectX-4 Lx and ConnectX-5 network adapters. Hardware supports both RoCE v1 and RoCE v2.
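As an illustration of how the two versions differ on the wire, the following minimal sketch maps header fields to a RoCE version. The function name `classify_roce` and its argument order are hypothetical; the sketch assumes the standard EtherType values 0x0800 (IPv4) and 0x86DD (IPv6) and the IP protocol number 17 for UDP. It only inspects numbers passed on the command line and does not capture traffic.

```shell
#!/bin/sh
# classify_roce ETHERTYPE [IP_PROTO] [UDP_DPORT]
# Hypothetical helper (not part of any RDMA tool): prints the RoCE version,
# if any, that the given header fields correspond to.
# Numeric arguments may be decimal or 0x-prefixed hexadecimal.
classify_roce() {
    ethertype=$(( $1 ))
    ip_proto=$(( ${2:-0} ))
    dport=$(( ${3:-0} ))

    # RoCE v1: dedicated EtherType, no IP header involved.
    if [ "$ethertype" -eq $(( 0x8915 )) ]; then
        echo "RoCE v1"
        return 0
    fi

    # RoCE v2: UDP (protocol 17) over IPv4 or IPv6, destination port 4791.
    is_ip=0
    if [ "$ethertype" -eq $(( 0x0800 )) ] || [ "$ethertype" -eq $(( 0x86DD )) ]; then
        is_ip=1
    fi
    if [ "$is_ip" -eq 1 ] && [ "$ip_proto" -eq 17 ] && [ "$dport" -eq 4791 ]; then
        echo "RoCE v2"
    else
        echo "not RoCE"
    fi
}

classify_roce 0x8915            # prints "RoCE v1"
classify_roce 0x0800 17 4791    # prints "RoCE v2"
classify_roce 0x86DD 17 443     # prints "not RoCE"
```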
RDMA Connection Manager (RDMA_CM) is used to set up a reliable connection between a client and a server for transferring data. RDMA_CM provides an RDMA transport-neutral interface for establishing connections. The communication is over a specific RDMA device, and data transfers are message-based.
Prerequisites
An RDMA_CM session requires one of the following:
- Both client and server support the same RoCE mode.
- A client supports RoCE v1 and a server RoCE v2.
Since a client determines the mode of the connection, the following cases are possible:
- A successful connection:
- Whether a client runs in RoCE v1 or RoCE v2 mode depends on the network card and the driver used. A connection is established if the corresponding server runs the same version as the client. The connection is also successful if the client is in RoCE v1 mode and the server is in RoCE v2 mode.
- A failed connection:
- If a client is in RoCE v2 mode and the corresponding server is in RoCE v1 mode, no connection can be established. In this case, update the driver or the network adapter of the corresponding server; see Section 13.2, “Transferring Data Using RoCE”.
Client | Server | Result with default settings |
---|---|---|
RoCE v1 | RoCE v1 | Connection |
RoCE v1 | RoCE v2 | Connection |
RoCE v2 | RoCE v2 | Connection |
RoCE v2 | RoCE v1 | No connection |
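
The compatibility rules in the table can be sketched as a small shell check. The function name `roce_can_connect` and the short mode labels `v1` and `v2` are hypothetical and used for illustration only; the sketch encodes the table and does not query any hardware.

```shell
#!/bin/sh
# roce_can_connect CLIENT_MODE SERVER_MODE
# Hypothetical helper encoding the compatibility table; modes are "v1" or "v2".
# Returns 0 if a connection is possible, 1 if not, 2 on an unknown mode.
roce_can_connect() {
    case "$1:$2" in
        v1:v1|v1:v2|v2:v2) return 0 ;;   # connection
        v2:v1)             return 1 ;;   # no connection
        *) echo "unknown mode: $1 / $2" >&2; return 2 ;;
    esac
}

# Walk through all four client/server combinations.
for client in v1 v2; do
    for server in v1 v2; do
        if roce_can_connect "$client" "$server"; then
            echo "client $client -> server $server: connection"
        else
            echo "client $client -> server $server: no connection"
        fi
    done
done
```

Only the combination of a RoCE v2 client with a RoCE v1 server is reported as "no connection".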
RoCE v2 on the client and RoCE v1 on the server are not compatible. To resolve this issue, force both the server-side and the client-side environment to communicate over RoCE v1. This means forcing hardware that supports RoCE v2 to use RoCE v1:
Procedure 13.1. Changing the Default RoCE Mode When the Hardware Is Already Running in RoCE v2
- Change into the /sys/kernel/config/rdma_cm directory to set the RoCE mode:
      ~]# cd /sys/kernel/config/rdma_cm
- Enter the ibstat command with an Ethernet network device to display the status. For example, for mlx5_0:
      ~]$ ibstat mlx5_0
      CA 'mlx5_0'
          CA type: MT4115
          Number of ports: 1
          Firmware version: 12.17.1010
          Hardware version: 0
          Node GUID: 0x248a0703004bf0a4
          System image GUID: 0x248a0703004bf0a4
          Port 1:
              State: Active
              Physical state: LinkUp
              Rate: 40
              Base lid: 0
              LMC: 0
              SM lid: 0
              Capability mask: 0x04010000
              Port GUID: 0x268a07fffe4bf0a4
              Link layer: Ethernet
- Create a directory for the mlx5_0 device:
      ~]# mkdir mlx5_0
- Display the directory tree for the device and the RoCE mode stored in the default_roce_mode file:
      ~]# cd mlx5_0
      ~]$ tree
      └── ports
          └── 1
              ├── default_roce_mode
              └── default_roce_tos
      ~]$ cat /sys/kernel/config/rdma_cm/mlx5_0/ports/1/default_roce_mode
      RoCE v2
- Change the default RoCE mode:
      ~]# echo "RoCE v1" > /sys/kernel/config/rdma_cm/mlx5_0/ports/1/default_roce_mode
- View the changes:
      ~]$ cat /sys/kernel/config/rdma_cm/mlx5_0/ports/1/default_roce_mode
      RoCE v1
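
The procedure above could be wrapped in a small shell function, sketched here under a few assumptions: the function name `set_roce_mode` and its optional fourth argument (an alternative configuration root, useful for dry runs) are hypothetical, the rdma_cm configuration tree is assumed to be available under /sys/kernel/config/rdma_cm as in the procedure, and the function must run as root when it targets the real tree.

```shell
#!/bin/sh
# set_roce_mode DEVICE PORT MODE [CONFIG_ROOT]
# Hypothetical wrapper around the manual steps: creates the device directory,
# writes the requested mode, and reads it back to confirm the change.
# CONFIG_ROOT defaults to the path used in the procedure; the optional
# override is an assumption added for testing outside /sys.
set_roce_mode() {
    dev=$1
    port=$2
    mode=$3
    root=${4:-/sys/kernel/config/rdma_cm}
    port_dir="$root/$dev/ports/$port"

    # Only the two values shown in the procedure are accepted.
    case "$mode" in
        "RoCE v1"|"RoCE v2") ;;
        *) echo "unsupported mode: $mode" >&2; return 2 ;;
    esac

    # Equivalent of "mkdir mlx5_0" inside the rdma_cm directory.
    mkdir -p "$root/$dev" || return 1

    # On the real configuration tree the ports directory appears once the
    # device directory exists; fail clearly if the port is missing.
    if [ ! -d "$port_dir" ]; then
        echo "port directory not found: $port_dir" >&2
        return 1
    fi

    # Change the default RoCE mode, then verify it by reading it back.
    printf '%s\n' "$mode" > "$port_dir/default_roce_mode" || return 1
    current=$(cat "$port_dir/default_roce_mode")
    [ "$current" = "$mode" ]
}
```

With these assumptions, running `set_roce_mode mlx5_0 1 "RoCE v1"` as root would correspond to the mkdir, echo, and cat steps of the procedure for port 1 of mlx5_0.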