5.10. RDMA 연결 확인


특히 레거시 단일 루트 I/O 가상화(SR-IOV) 이더넷의 경우 시스템 간에 원격 직접 메모리 액세스(RDMA) 연결이 작동하는지 확인합니다.

프로세스

  1. 다음 명령을 사용하여 각 rdma-workload-client 포드에 연결합니다.

    $ oc rsh -n default rdma-sriov-32-workload
    Copy to Clipboard Toggle word wrap

    출력 예

    sh-5.1#
    Copy to Clipboard Toggle word wrap

  2. 다음 명령을 사용하여 첫 번째 워크로드 포드에 할당된 IP 주소를 확인하세요. 이 예에서 첫 번째 워크로드 포드는 RDMA 테스트 서버입니다.

    sh-5.1# ip a
    Copy to Clipboard Toggle word wrap

    출력 예

    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: eth0@if3970: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UP group default
        link/ether 0a:58:0a:80:02:a7 brd ff:ff:ff:ff:ff:ff link-netnsid 0
        inet 10.128.2.167/23 brd 10.128.3.255 scope global eth0
           valid_lft forever preferred_lft forever
        inet6 fe80::858:aff:fe80:2a7/64 scope link
           valid_lft forever preferred_lft forever
    3843: net1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
        link/ether 26:34:fd:53:a6:ec brd ff:ff:ff:ff:ff:ff
        altname enp55s0f0v5
        inet 192.168.4.225/28 brd 192.168.4.239 scope global net1
           valid_lft forever preferred_lft forever
        inet6 fe80::2434:fdff:fe53:a6ec/64 scope link
           valid_lft forever preferred_lft forever
    sh-5.1#
    Copy to Clipboard Toggle word wrap

    이 포드에 할당된 RDMA 서버의 IP 주소는 net1 인터페이스입니다. 이 예에서 IP 주소는 192.168.4.225 입니다.

  3. ibstatus 명령을 실행하여 각 RDMA 장치 mlx5_x 와 연관된 link_layer 유형(Ethernet 또는 Infiniband)을 가져옵니다. 출력에서는 상태 필드를 확인하여 모든 RDMA 장치의 상태도 보여줍니다. 상태 필드에는 ACTIVE 또는 DOWN 이 표시됩니다.

    sh-5.1# ibstatus
    Copy to Clipboard Toggle word wrap

    출력 예

    Infiniband device 'mlx5_0' port 1 status:
    	default gid:	 0000:0000:0000:0000:0000:0000:0000:0000
    	base lid:	 0x0
    	sm lid:		 0x0
    	state:		 4: ACTIVE
    	phys state:	 5: LinkUp
    	rate:		 200 Gb/sec (4X HDR)
    	link_layer:	 Ethernet
    
    Infiniband device 'mlx5_1' port 1 status:
    	default gid:	 fe80:0000:0000:0000:e8eb:d303:0072:1415
    	base lid:	 0xc
    	sm lid:		 0x1
    	state:		 4: ACTIVE
    	phys state:	 5: LinkUp
    	rate:		 200 Gb/sec (4X HDR)
    	link_layer:	 InfiniBand
    
    Infiniband device 'mlx5_2' port 1 status:
    	default gid:	 0000:0000:0000:0000:0000:0000:0000:0000
    	base lid:	 0x0
    	sm lid:		 0x0
    	state:		 1: DOWN
    	phys state:	 3: Disabled
    	rate:		 200 Gb/sec (4X HDR)
    	link_layer:	 Ethernet
    
    Infiniband device 'mlx5_3' port 1 status:
    	default gid:	 0000:0000:0000:0000:0000:0000:0000:0000
    	base lid:	 0x0
    	sm lid:		 0x0
    	state:		 1: DOWN
    	phys state:	 3: Disabled
    	rate:		 200 Gb/sec (4X HDR)
    	link_layer:	 Ethernet
    
    Infiniband device 'mlx5_4' port 1 status:
    	default gid:	 0000:0000:0000:0000:0000:0000:0000:0000
    	base lid:	 0x0
    	sm lid:		 0x0
    	state:		 1: DOWN
    	phys state:	 3: Disabled
    	rate:		 200 Gb/sec (4X HDR)
    	link_layer:	 Ethernet
    
    Infiniband device 'mlx5_5' port 1 status:
    	default gid:	 0000:0000:0000:0000:0000:0000:0000:0000
    	base lid:	 0x0
    	sm lid:		 0x0
    	state:		 1: DOWN
    	phys state:	 3: Disabled
    	rate:		 200 Gb/sec (4X HDR)
    	link_layer:	 Ethernet
    
    Infiniband device 'mlx5_6' port 1 status:
    	default gid:	 0000:0000:0000:0000:0000:0000:0000:0000
    	base lid:	 0x0
    	sm lid:		 0x0
    	state:		 1: DOWN
    	phys state:	 3: Disabled
    	rate:		 200 Gb/sec (4X HDR)
    	link_layer:	 Ethernet
    
    Infiniband device 'mlx5_7' port 1 status:
    	default gid:	 fe80:0000:0000:0000:2434:fdff:fe53:a6ec
    	base lid:	 0x0
    	sm lid:		 0x0
    	state:		 4: ACTIVE
    	phys state:	 5: LinkUp
    	rate:		 200 Gb/sec (4X HDR)
    	link_layer:	 Ethernet
    
    Infiniband device 'mlx5_8' port 1 status:
    	default gid:	 0000:0000:0000:0000:0000:0000:0000:0000
    	base lid:	 0x0
    	sm lid:		 0x0
    	state:		 1: DOWN
    	phys state:	 3: Disabled
    	rate:		 200 Gb/sec (4X HDR)
    	link_layer:	 Ethernet
    
    Infiniband device 'mlx5_9' port 1 status:
    	default gid:	 0000:0000:0000:0000:0000:0000:0000:0000
    	base lid:	 0x0
    	sm lid:		 0x0
    	state:		 1: DOWN
    	phys state:	 3: Disabled
    	rate:		 200 Gb/sec (4X HDR)
    	link_layer:	 Ethernet
    
    sh-5.1#
    Copy to Clipboard Toggle word wrap

  4. 작업자 노드의 각 RDMA mlx5 장치에 대한 link_layer를 얻으려면 ibstat 명령을 실행하세요.

    sh-5.1# ibstat | egrep "Port|Base|Link"
    Copy to Clipboard Toggle word wrap

    출력 예

    Port 1:
    		Physical state: LinkUp
    		Base lid: 0
    		Port GUID: 0x0000000000000000
    		Link layer: Ethernet
    	Port 1:
    		Physical state: LinkUp
    		Base lid: 12
    		Port GUID: 0xe8ebd30300721415
    		Link layer: InfiniBand
    	Port 1:
    		Base lid: 0
    		Port GUID: 0x0000000000000000
    		Link layer: Ethernet
    	Port 1:
    		Base lid: 0
    		Port GUID: 0x0000000000000000
    		Link layer: Ethernet
    	Port 1:
    		Base lid: 0
    		Port GUID: 0x0000000000000000
    		Link layer: Ethernet
    	Port 1:
    		Base lid: 0
    		Port GUID: 0x0000000000000000
    		Link layer: Ethernet
    	Port 1:
    		Base lid: 0
    		Port GUID: 0x0000000000000000
    		Link layer: Ethernet
    	Port 1:
    		Physical state: LinkUp
    		Base lid: 0
    		Port GUID: 0x2434fdfffe53a6ec
    		Link layer: Ethernet
    	Port 1:
    		Base lid: 0
    		Port GUID: 0x0000000000000000
    		Link layer: Ethernet
    	Port 1:
    		Base lid: 0
    		Port GUID: 0x0000000000000000
    		Link layer: Ethernet
    sh-5.1#
    Copy to Clipboard Toggle word wrap

  5. RDMA 공유 장치 또는 호스트 장치 워크로드 포드의 경우 mlx5_x 라는 RDMA 장치는 이미 알려져 있으며 일반적으로 mlx5_0 또는 mlx5_1 입니다. RDMA 레거시 SR-IOV 워크로드 포드의 경우, 어떤 RDMA 장치가 어떤 가상 기능(VF) 하위 인터페이스와 연결되어 있는지 확인해야 합니다. 다음 명령을 사용하여 이 정보를 제공하세요.

    sh-5.1# rdma link show
    Copy to Clipboard Toggle word wrap

    출력 예

    link mlx5_0/1 state ACTIVE physical_state LINK_UP
    link mlx5_1/1 subnet_prefix fe80:0000:0000:0000 lid 12 sm_lid 1 lmc 0 state ACTIVE physical_state LINK_UP
    link mlx5_2/1 state DOWN physical_state DISABLED
    link mlx5_3/1 state DOWN physical_state DISABLED
    link mlx5_4/1 state DOWN physical_state DISABLED
    link mlx5_5/1 state DOWN physical_state DISABLED
    link mlx5_6/1 state DOWN physical_state DISABLED
    link mlx5_7/1 state ACTIVE physical_state LINK_UP netdev net1
    link mlx5_8/1 state DOWN physical_state DISABLED
    link mlx5_9/1 state DOWN physical_state DISABLED
    Copy to Clipboard Toggle word wrap

    이 예에서 RDMA 장치 이름 mlx5_7net1 인터페이스와 연결됩니다. 이 출력은 다음 명령에서 RDMA 대역폭 테스트를 수행하는 데 사용되며, 이를 통해 작업자 노드 간의 RDMA 연결도 검증됩니다.

  6. 다음 ib_write_bw RDMA 대역폭 테스트 명령을 실행하세요.

    sh-5.1# /root/perftest/ib_write_bw -R -T 41 -s 65536 -F -x 3 -m 4096 --report_gbits -q 16 -D 60  -d mlx5_7 -p 10000 --source_ip  192.168.4.225 --use_cuda=0 --use_cuda_dmabuf
    Copy to Clipboard Toggle word wrap

    다음과 같습니다.

    • mlx5_7 RDMA 장치는 -d 스위치로 전달됩니다.
    • RDMA 서버를 시작하기 위한 소스 IP 주소는 192.168.4.225 입니다.
    • --use_cuda=0 , --use_cuda_dmabuf 스위치는 GPUDirect RDMA를 사용한다는 것을 나타냅니다.

    출력 예

    WARNING: BW peak won't be measured in this run.
    Perftest doesn't supports CUDA tests with inline messages: inline size set to 0
    
    ************************************
    * Waiting for client to connect... *
    ************************************
    Copy to Clipboard Toggle word wrap

  7. 다른 터미널 창을 열고 RDMA 테스트 클라이언트 포드 역할을 하는 두 번째 워크로드 포드에서 oc rsh 명령을 실행합니다.

    $ oc rsh -n default rdma-sriov-33-workload
    Copy to Clipboard Toggle word wrap

    출력 예

    sh-5.1#
    Copy to Clipboard Toggle word wrap

  8. 다음 명령을 사용하여 net1 인터페이스에서 RDMA 테스트 클라이언트 Pod IP 주소를 얻습니다.

    sh-5.1# ip a
    Copy to Clipboard Toggle word wrap

    출력 예

    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: eth0@if4139: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc noqueue state UP group default
        link/ether 0a:58:0a:83:01:d5 brd ff:ff:ff:ff:ff:ff link-netnsid 0
        inet 10.131.1.213/23 brd 10.131.1.255 scope global eth0
           valid_lft forever preferred_lft forever
        inet6 fe80::858:aff:fe83:1d5/64 scope link
           valid_lft forever preferred_lft forever
    4076: net1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
        link/ether 56:6c:59:41:ae:4a brd ff:ff:ff:ff:ff:ff
        altname enp55s0f0v0
        inet 192.168.4.226/28 brd 192.168.4.239 scope global net1
           valid_lft forever preferred_lft forever
        inet6 fe80::546c:59ff:fe41:ae4a/64 scope link
           valid_lft forever preferred_lft forever
    sh-5.1#
    Copy to Clipboard Toggle word wrap

  9. 다음 명령을 사용하여 각 RDMA 장치 mlx5_x 와 연관된 link_layer 유형을 가져옵니다.

    sh-5.1# ibstatus
    Copy to Clipboard Toggle word wrap

    출력 예

    Infiniband device 'mlx5_0' port 1 status:
    	default gid:	 0000:0000:0000:0000:0000:0000:0000:0000
    	base lid:	 0x0
    	sm lid:		 0x0
    	state:		 4: ACTIVE
    	phys state:	 5: LinkUp
    	rate:		 200 Gb/sec (4X HDR)
    	link_layer:	 Ethernet
    
    Infiniband device 'mlx5_1' port 1 status:
    	default gid:	 fe80:0000:0000:0000:e8eb:d303:0072:09f5
    	base lid:	 0xd
    	sm lid:		 0x1
    	state:		 4: ACTIVE
    	phys state:	 5: LinkUp
    	rate:		 200 Gb/sec (4X HDR)
    	link_layer:	 InfiniBand
    
    Infiniband device 'mlx5_2' port 1 status:
    	default gid:	 fe80:0000:0000:0000:546c:59ff:fe41:ae4a
    	base lid:	 0x0
    	sm lid:		 0x0
    	state:		 4: ACTIVE
    	phys state:	 5: LinkUp
    	rate:		 200 Gb/sec (4X HDR)
    	link_layer:	 Ethernet
    
    Infiniband device 'mlx5_3' port 1 status:
    	default gid:	 0000:0000:0000:0000:0000:0000:0000:0000
    	base lid:	 0x0
    	sm lid:		 0x0
    	state:		 1: DOWN
    	phys state:	 3: Disabled
    	rate:		 200 Gb/sec (4X HDR)
    	link_layer:	 Ethernet
    
    Infiniband device 'mlx5_4' port 1 status:
    	default gid:	 0000:0000:0000:0000:0000:0000:0000:0000
    	base lid:	 0x0
    	sm lid:		 0x0
    	state:		 1: DOWN
    	phys state:	 3: Disabled
    	rate:		 200 Gb/sec (4X HDR)
    	link_layer:	 Ethernet
    
    Infiniband device 'mlx5_5' port 1 status:
    	default gid:	 0000:0000:0000:0000:0000:0000:0000:0000
    	base lid:	 0x0
    	sm lid:		 0x0
    	state:		 1: DOWN
    	phys state:	 3: Disabled
    	rate:		 200 Gb/sec (4X HDR)
    	link_layer:	 Ethernet
    
    Infiniband device 'mlx5_6' port 1 status:
    	default gid:	 0000:0000:0000:0000:0000:0000:0000:0000
    	base lid:	 0x0
    	sm lid:		 0x0
    	state:		 1: DOWN
    	phys state:	 3: Disabled
    	rate:		 200 Gb/sec (4X HDR)
    	link_layer:	 Ethernet
    
    Infiniband device 'mlx5_7' port 1 status:
    	default gid:	 0000:0000:0000:0000:0000:0000:0000:0000
    	base lid:	 0x0
    	sm lid:		 0x0
    	state:		 1: DOWN
    	phys state:	 3: Disabled
    	rate:		 200 Gb/sec (4X HDR)
    	link_layer:	 Ethernet
    
    Infiniband device 'mlx5_8' port 1 status:
    	default gid:	 0000:0000:0000:0000:0000:0000:0000:0000
    	base lid:	 0x0
    	sm lid:		 0x0
    	state:		 1: DOWN
    	phys state:	 3: Disabled
    	rate:		 200 Gb/sec (4X HDR)
    	link_layer:	 Ethernet
    
    Infiniband device 'mlx5_9' port 1 status:
    	default gid:	 0000:0000:0000:0000:0000:0000:0000:0000
    	base lid:	 0x0
    	sm lid:		 0x0
    	state:		 1: DOWN
    	phys state:	 3: Disabled
    	rate:		 200 Gb/sec (4X HDR)
    	link_layer:	 Ethernet
    Copy to Clipboard Toggle word wrap

  10. 선택 사항: ibstat 명령을 사용하여 Mellanox 카드의 펌웨어 버전을 가져옵니다.

    sh-5.1# ibstat
    Copy to Clipboard Toggle word wrap

    출력 예

    CA 'mlx5_0'
    	CA type: MT4123
    	Number of ports: 1
    	Firmware version: 20.43.1014
    	Hardware version: 0
    	Node GUID: 0xe8ebd303007209f4
    	System image GUID: 0xe8ebd303007209f4
    	Port 1:
    		State: Active
    		Physical state: LinkUp
    		Rate: 200
    		Base lid: 0
    		LMC: 0
    		SM lid: 0
    		Capability mask: 0x00010000
    		Port GUID: 0x0000000000000000
    		Link layer: Ethernet
    CA 'mlx5_1'
    	CA type: MT4123
    	Number of ports: 1
    	Firmware version: 20.43.1014
    	Hardware version: 0
    	Node GUID: 0xe8ebd303007209f5
    	System image GUID: 0xe8ebd303007209f4
    	Port 1:
    		State: Active
    		Physical state: LinkUp
    		Rate: 200
    		Base lid: 13
    		LMC: 0
    		SM lid: 1
    		Capability mask: 0xa651e848
    		Port GUID: 0xe8ebd303007209f5
    		Link layer: InfiniBand
    CA 'mlx5_2'
    	CA type: MT4124
    	Number of ports: 1
    	Firmware version: 20.43.1014
    	Hardware version: 0
    	Node GUID: 0x566c59fffe41ae4a
    	System image GUID: 0xe8ebd303007209f4
    	Port 1:
    		State: Active
    		Physical state: LinkUp
    		Rate: 200
    		Base lid: 0
    		LMC: 0
    		SM lid: 0
    		Capability mask: 0x00010000
    		Port GUID: 0x546c59fffe41ae4a
    		Link layer: Ethernet
    CA 'mlx5_3'
    	CA type: MT4124
    	Number of ports: 1
    	Firmware version: 20.43.1014
    	Hardware version: 0
    	Node GUID: 0xb2ae4bfffe8f3d02
    	System image GUID: 0xe8ebd303007209f4
    	Port 1:
    		State: Down
    		Physical state: Disabled
    		Rate: 200
    		Base lid: 0
    		LMC: 0
    		SM lid: 0
    		Capability mask: 0x00010000
    		Port GUID: 0x0000000000000000
    		Link layer: Ethernet
    CA 'mlx5_4'
    	CA type: MT4124
    	Number of ports: 1
    	Firmware version: 20.43.1014
    	Hardware version: 0
    	Node GUID: 0x2a9967fffe8bf272
    	System image GUID: 0xe8ebd303007209f4
    	Port 1:
    		State: Down
    		Physical state: Disabled
    		Rate: 200
    		Base lid: 0
    		LMC: 0
    		SM lid: 0
    		Capability mask: 0x00010000
    		Port GUID: 0x0000000000000000
    		Link layer: Ethernet
    CA 'mlx5_5'
    	CA type: MT4124
    	Number of ports: 1
    	Firmware version: 20.43.1014
    	Hardware version: 0
    	Node GUID: 0x5aff2ffffe2e17e8
    	System image GUID: 0xe8ebd303007209f4
    	Port 1:
    		State: Down
    		Physical state: Disabled
    		Rate: 200
    		Base lid: 0
    		LMC: 0
    		SM lid: 0
    		Capability mask: 0x00010000
    		Port GUID: 0x0000000000000000
    		Link layer: Ethernet
    CA 'mlx5_6'
    	CA type: MT4124
    	Number of ports: 1
    	Firmware version: 20.43.1014
    	Hardware version: 0
    	Node GUID: 0x121bf1fffe074419
    	System image GUID: 0xe8ebd303007209f4
    	Port 1:
    		State: Down
    		Physical state: Disabled
    		Rate: 200
    		Base lid: 0
    		LMC: 0
    		SM lid: 0
    		Capability mask: 0x00010000
    		Port GUID: 0x0000000000000000
    		Link layer: Ethernet
    CA 'mlx5_7'
    	CA type: MT4124
    	Number of ports: 1
    	Firmware version: 20.43.1014
    	Hardware version: 0
    	Node GUID: 0xb22b16fffed03dd7
    	System image GUID: 0xe8ebd303007209f4
    	Port 1:
    		State: Down
    		Physical state: Disabled
    		Rate: 200
    		Base lid: 0
    		LMC: 0
    		SM lid: 0
    		Capability mask: 0x00010000
    		Port GUID: 0x0000000000000000
    		Link layer: Ethernet
    CA 'mlx5_8'
    	CA type: MT4124
    	Number of ports: 1
    	Firmware version: 20.43.1014
    	Hardware version: 0
    	Node GUID: 0x523800fffe16d105
    	System image GUID: 0xe8ebd303007209f4
    	Port 1:
    		State: Down
    		Physical state: Disabled
    		Rate: 200
    		Base lid: 0
    		LMC: 0
    		SM lid: 0
    		Capability mask: 0x00010000
    		Port GUID: 0x0000000000000000
    		Link layer: Ethernet
    CA 'mlx5_9'
    	CA type: MT4124
    	Number of ports: 1
    	Firmware version: 20.43.1014
    	Hardware version: 0
    	Node GUID: 0xd2b4a1fffebdc4a9
    	System image GUID: 0xe8ebd303007209f4
    	Port 1:
    		State: Down
    		Physical state: Disabled
    		Rate: 200
    		Base lid: 0
    		LMC: 0
    		SM lid: 0
    		Capability mask: 0x00010000
    		Port GUID: 0x0000000000000000
    		Link layer: Ethernet
    sh-5.1#
    Copy to Clipboard Toggle word wrap

  11. 클라이언트 워크로드 포드가 사용하는 가상 기능 하위 인터페이스와 연결된 RDMA 장치를 확인하려면 다음 명령을 실행하세요. 이 예에서 net1 인터페이스는 RDMA 장치 mlx5_2를 사용하고 있습니다.

    sh-5.1# rdma link show
    Copy to Clipboard Toggle word wrap

    출력 예

    link mlx5_0/1 state ACTIVE physical_state LINK_UP
    link mlx5_1/1 subnet_prefix fe80:0000:0000:0000 lid 13 sm_lid 1 lmc 0 state ACTIVE physical_state LINK_UP
    link mlx5_2/1 state ACTIVE physical_state LINK_UP netdev net1
    link mlx5_3/1 state DOWN physical_state DISABLED
    link mlx5_4/1 state DOWN physical_state DISABLED
    link mlx5_5/1 state DOWN physical_state DISABLED
    link mlx5_6/1 state DOWN physical_state DISABLED
    link mlx5_7/1 state DOWN physical_state DISABLED
    link mlx5_8/1 state DOWN physical_state DISABLED
    link mlx5_9/1 state DOWN physical_state DISABLED
    sh-5.1#
    Copy to Clipboard Toggle word wrap

  12. 다음 ib_write_bw RDMA 대역폭 테스트 명령을 실행하세요.

    sh-5.1# /root/perftest/ib_write_bw -R -T 41 -s 65536 -F -x 3 -m 4096 --report_gbits -q 16 -D 60  -d mlx5_2 -p 10000 --source_ip  192.168.4.226 --use_cuda=0 --use_cuda_dmabuf 192.168.4.225
    Copy to Clipboard Toggle word wrap

    다음과 같습니다.

    • mlx5_2 RDMA 장치는 -d 스위치로 전달됩니다.
    • RDMA 서버의 소스 IP 주소는 192.168.4.226 이고 대상 IP 주소는 192.168.4.225입니다 .
    • --use_cuda=0 , --use_cuda_dmabuf 스위치는 GPUDirect RDMA를 사용한다는 것을 나타냅니다.

      출력 예

      WARNING: BW peak won't be measured in this run.
      Perftest doesn't supports CUDA tests with inline messages: inline size set to 0
      Requested mtu is higher than active mtu
      Changing to active mtu - 3
      initializing CUDA
      Listing all CUDA devices in system:
      CUDA device 0: PCIe address is 61:00
      
      Picking device No. 0
      [pid = 8909, dev = 0] device name = [NVIDIA A40]
      creating CUDA Ctx
      making it the current CUDA Ctx
      CUDA device integrated: 0
      using DMA-BUF for GPU buffer address at 0x7f8738600000 aligned at 0x7f8738600000 with aligned size 2097152
      allocated GPU buffer of a 2097152 address at 0x23a7420 for type CUDA_MEM_DEVICE
      Calling ibv_reg_dmabuf_mr(offset=0, size=2097152, addr=0x7f8738600000, fd=40) for QP #0
      ---------------------------------------------------------------------------------------
                          RDMA_Write BW Test
       Dual-port       : OFF		Device         : mlx5_2
       Number of qps   : 16		Transport type : IB
       Connection type : RC		Using SRQ      : OFF
       PCIe relax order: ON		Lock-free      : OFF
       ibv_wr* API     : ON		Using DDP      : OFF
       TX depth        : 128
       CQ Moderation   : 1
       CQE Poll Batch  : 16
       Mtu             : 1024[B]
       Link type       : Ethernet
       GID index       : 3
       Max inline data : 0[B]
       rdma_cm QPs	 : ON
       Data ex. method : rdma_cm 	TOS    : 41
      ---------------------------------------------------------------------------------------
       local address: LID 0000 QPN 0x012d PSN 0x3cb6d7
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       local address: LID 0000 QPN 0x012e PSN 0x90e0ac
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       local address: LID 0000 QPN 0x012f PSN 0x153f50
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       local address: LID 0000 QPN 0x0130 PSN 0x5e0128
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       local address: LID 0000 QPN 0x0131 PSN 0xd89752
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       local address: LID 0000 QPN 0x0132 PSN 0xe5fc16
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       local address: LID 0000 QPN 0x0133 PSN 0x236787
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       local address: LID 0000 QPN 0x0134 PSN 0xd9273e
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       local address: LID 0000 QPN 0x0135 PSN 0x37cfd4
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       local address: LID 0000 QPN 0x0136 PSN 0x3bff8f
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       local address: LID 0000 QPN 0x0137 PSN 0x81f2bd
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       local address: LID 0000 QPN 0x0138 PSN 0x575c43
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       local address: LID 0000 QPN 0x0139 PSN 0x6cf53d
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       local address: LID 0000 QPN 0x013a PSN 0xcaaf6f
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       local address: LID 0000 QPN 0x013b PSN 0x346437
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       local address: LID 0000 QPN 0x013c PSN 0xcc5865
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       remote address: LID 0000 QPN 0x026d PSN 0x359409
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       remote address: LID 0000 QPN 0x026e PSN 0xe387bf
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       remote address: LID 0000 QPN 0x026f PSN 0x5be79d
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       remote address: LID 0000 QPN 0x0270 PSN 0x1b4b28
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       remote address: LID 0000 QPN 0x0271 PSN 0x76a61b
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       remote address: LID 0000 QPN 0x0272 PSN 0x3d50e1
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       remote address: LID 0000 QPN 0x0273 PSN 0x1b572c
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       remote address: LID 0000 QPN 0x0274 PSN 0x4ae1b5
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       remote address: LID 0000 QPN 0x0275 PSN 0x5591b5
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       remote address: LID 0000 QPN 0x0276 PSN 0xfa2593
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       remote address: LID 0000 QPN 0x0277 PSN 0xd9473b
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       remote address: LID 0000 QPN 0x0278 PSN 0x2116b2
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       remote address: LID 0000 QPN 0x0279 PSN 0x9b83b6
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       remote address: LID 0000 QPN 0x027a PSN 0xa0822b
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       remote address: LID 0000 QPN 0x027b PSN 0x6d930d
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       remote address: LID 0000 QPN 0x027c PSN 0xb1a4d
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
      ---------------------------------------------------------------------------------------
       #bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]   MsgRate[Mpps]
       65536      10329004         0.00               180.47 		     0.344228
      ---------------------------------------------------------------------------------------
      deallocating GPU buffer 00007f8738600000
      destroying current CUDA Ctx
      sh-5.1#
      Copy to Clipboard Toggle word wrap

      긍정적인 테스트는 예상되는 BW 평균과 MsgRate가 Mpps 단위로 나타나는 것을 의미합니다.

      ib_write_bw 명령이 완료되면 서버 측 출력도 서버 포드에 나타납니다. 다음 예제를 참조하십시오.

      출력 예

      WARNING: BW peak won't be measured in this run.
      Perftest doesn't supports CUDA tests with inline messages: inline size set to 0
      
      ************************************
      * Waiting for client to connect... *
      ************************************
      Requested mtu is higher than active mtu
      Changing to active mtu - 3
      initializing CUDA
      Listing all CUDA devices in system:
      CUDA device 0: PCIe address is 61:00
      
      Picking device No. 0
      [pid = 9226, dev = 0] device name = [NVIDIA A40]
      creating CUDA Ctx
      making it the current CUDA Ctx
      CUDA device integrated: 0
      using DMA-BUF for GPU buffer address at 0x7f447a600000 aligned at 0x7f447a600000 with aligned size 2097152
      allocated GPU buffer of a 2097152 address at 0x2406400 for type CUDA_MEM_DEVICE
      Calling ibv_reg_dmabuf_mr(offset=0, size=2097152, addr=0x7f447a600000, fd=40) for QP #0
      ---------------------------------------------------------------------------------------
                          RDMA_Write BW Test
       Dual-port       : OFF		Device         : mlx5_7
       Number of qps   : 16		Transport type : IB
       Connection type : RC		Using SRQ      : OFF
       PCIe relax order: ON		Lock-free      : OFF
       ibv_wr* API     : ON		Using DDP      : OFF
       CQ Moderation   : 1
       CQE Poll Batch  : 16
       Mtu             : 1024[B]
       Link type       : Ethernet
       GID index       : 3
       Max inline data : 0[B]
       rdma_cm QPs	 : ON
       Data ex. method : rdma_cm 	TOS    : 41
      ---------------------------------------------------------------------------------------
       Waiting for client rdma_cm QP to connect
       Please run the same command with the IB/RoCE interface IP
      ---------------------------------------------------------------------------------------
       local address: LID 0000 QPN 0x026d PSN 0x359409
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       local address: LID 0000 QPN 0x026e PSN 0xe387bf
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       local address: LID 0000 QPN 0x026f PSN 0x5be79d
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       local address: LID 0000 QPN 0x0270 PSN 0x1b4b28
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       local address: LID 0000 QPN 0x0271 PSN 0x76a61b
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       local address: LID 0000 QPN 0x0272 PSN 0x3d50e1
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       local address: LID 0000 QPN 0x0273 PSN 0x1b572c
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       local address: LID 0000 QPN 0x0274 PSN 0x4ae1b5
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       local address: LID 0000 QPN 0x0275 PSN 0x5591b5
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       local address: LID 0000 QPN 0x0276 PSN 0xfa2593
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       local address: LID 0000 QPN 0x0277 PSN 0xd9473b
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       local address: LID 0000 QPN 0x0278 PSN 0x2116b2
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       local address: LID 0000 QPN 0x0279 PSN 0x9b83b6
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       local address: LID 0000 QPN 0x027a PSN 0xa0822b
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       local address: LID 0000 QPN 0x027b PSN 0x6d930d
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       local address: LID 0000 QPN 0x027c PSN 0xb1a4d
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:225
       remote address: LID 0000 QPN 0x012d PSN 0x3cb6d7
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       remote address: LID 0000 QPN 0x012e PSN 0x90e0ac
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       remote address: LID 0000 QPN 0x012f PSN 0x153f50
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       remote address: LID 0000 QPN 0x0130 PSN 0x5e0128
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       remote address: LID 0000 QPN 0x0131 PSN 0xd89752
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       remote address: LID 0000 QPN 0x0132 PSN 0xe5fc16
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       remote address: LID 0000 QPN 0x0133 PSN 0x236787
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       remote address: LID 0000 QPN 0x0134 PSN 0xd9273e
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       remote address: LID 0000 QPN 0x0135 PSN 0x37cfd4
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       remote address: LID 0000 QPN 0x0136 PSN 0x3bff8f
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       remote address: LID 0000 QPN 0x0137 PSN 0x81f2bd
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       remote address: LID 0000 QPN 0x0138 PSN 0x575c43
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       remote address: LID 0000 QPN 0x0139 PSN 0x6cf53d
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       remote address: LID 0000 QPN 0x013a PSN 0xcaaf6f
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       remote address: LID 0000 QPN 0x013b PSN 0x346437
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
       remote address: LID 0000 QPN 0x013c PSN 0xcc5865
       GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:04:226
      ---------------------------------------------------------------------------------------
       #bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]   MsgRate[Mpps]
       65536      10329004         0.00               180.47 		     0.344228
      ---------------------------------------------------------------------------------------
      deallocating GPU buffer 00007f447a600000
      destroying current CUDA Ctx
      Copy to Clipboard Toggle word wrap

맨 위로 이동
Red Hat logoGithubredditYoutubeTwitter

자세한 정보

평가판, 구매 및 판매

커뮤니티

Red Hat 문서 정보

Red Hat을 사용하는 고객은 신뢰할 수 있는 콘텐츠가 포함된 제품과 서비스를 통해 혁신하고 목표를 달성할 수 있습니다. 최신 업데이트를 확인하세요.

보다 포괄적 수용을 위한 오픈 소스 용어 교체

Red Hat은 코드, 문서, 웹 속성에서 문제가 있는 언어를 교체하기 위해 최선을 다하고 있습니다. 자세한 내용은 다음을 참조하세요.Red Hat 블로그.

Red Hat 소개

Red Hat은 기업이 핵심 데이터 센터에서 네트워크 에지에 이르기까지 플랫폼과 환경 전반에서 더 쉽게 작업할 수 있도록 강화된 솔루션을 제공합니다.

Theme

© 2025 Red Hat