My SoftRoCE (on a CX4 card) can't work with hard RoCE (on a CX4 card)

On the SoftRoCE side, rdma-core is version 26.0 and perftest is version 4.8, both the latest versions from https://github.com/linux-rdma/. On the hard RoCE side, the MLNX_OFED version is MLNX_OFED_LINUX-4.5-1.0.1.0-ubuntu16.04-x86_64.tgz. We also tried MLNX_OFED 4.7; it doesn't work either.

rping result:

On the hard RoCE client:

root@train-gpu9:~ # rping -cd -a 192.168.0.20 -p 1234 -C 1

created cm_id 0xdd8d90

cma_event type RDMA_CM_EVENT_ADDR_RESOLVED cma_id 0xdd8d90 (parent)

cma_event type RDMA_CM_EVENT_ROUTE_RESOLVED cma_id 0xdd8d90 (parent)

rdma_resolve_addr - rdma_resolve_route successful

created pd 0xdd84a0

created channel 0xdd8540

created cq 0xddaff0

created qp 0xddd308

rping_setup_buffers called on cb 0xdd53c0

allocated & registered buffers…

cq_thread started.

cma_event type RDMA_CM_EVENT_ESTABLISHED cma_id 0xdd8d90 (parent)

ESTABLISHED

rmda_connect successful

RDMA addr dd60f0 rkey 24fb2 len 64

send completion

recv completion

RDMA addr dd8560 rkey 242a5 len 64

send completion

recv completion

cma_event type RDMA_CM_EVENT_DISCONNECTED cma_id 0xdd8d90 (parent)

client DISCONNECT EVENT…

rping_free_buffers called on cb 0xdd53c0

destroy cm_id 0xdd8d90

11:18 root@train-gpu9:~ #

On the SoftRoCE server:

root@train-gpu10:~# rping -sd -a 192.168.0.20 -p 1234 -C 1

port 1234

count 1

created cm_id 0x1760820

rdma_bind_addr successful

rdma_listen

cma_event type RDMA_CM_EVENT_CONNECT_REQUEST cma_id 0x7fe504000a30 (child)

child cma 0x7fe504000a30

created pd 0x17580b0

created channel 0x17580d0

created cq 0x1763070

created qp 0x1763120

rping_setup_buffers called on cb 0x17573c0

allocated & registered buffers…

accepting client connection request

cq_thread started.

cma_event type RDMA_CM_EVENT_ESTABLISHED cma_id 0x7fe504000a30 (child)

ESTABLISHED

recv completion

Received rkey 24fb2 addr dd60f0 len 64 from peer

server received sink adv

server posted rdma read req

rdma read completion

server received read complete

server posted go ahead

send completion

recv completion

Received rkey 242a5 addr dd8560 len 64 from peer

server received sink adv

rdma write from lkey 1258 laddr 17632a0 len 64

rdma write completion

server rdma write complete

server posted go ahead

send completion

cma_event type RDMA_CM_EVENT_DISCONNECTED cma_id 0x7fe504000a30 (child)

server DISCONNECT EVENT…

wait for RDMA_READ_ADV state 10

rping_free_buffers called on cb 0x17573c0

destroy cm_id 0x1760820

root@train-gpu10:~#

ib_send_bw result:

On the hard RoCE client:

root@train-gpu9:/etc/libibverbs.d # ib_send_bw -a -d mlx5_0 192.168.0.20

Requested SQ size might be too big. Try reducing TX depth and/or inline size.

Current TX depth is 128 and inline size is 0 .


Send BW Test

Dual-port : OFF Device : mlx5_0

Number of qps : 1 Transport type : IB

Connection type : RC Using SRQ : OFF

TX depth : 128

CQ Moderation : 100

Mtu : 1024[B]

Link type : Ethernet

GID index : 3

Max inline data : 0[B]

rdma_cm QPs : OFF

Data ex. method : Ethernet


local address: LID 0000 QPN 0x0106 PSN 0x7511e0

GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:00:19

remote address: LID 0000 QPN 0x0011 PSN 0xaaa186

GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:00:20


#bytes #iterations BW peak[MB/sec] BW average[MB/sec] MsgRate[Mpps]

Completion with error at client

Failed status 12: wr_id 0 syndrom 0x81

scnt=128, ccnt=0

On the SoftRoCE server:

root@train-gpu10:~# ib_send_bw -a


* Waiting for client to connect... *


Send BW Test

Dual-port : OFF Device : rxe0

Number of qps : 1 Transport type : IB

Connection type : RC Using SRQ : OFF

RX depth : 512

CQ Moderation : 100

Mtu : 1024[B]

Link type : Ethernet

GID index : 1

Max inline data : 0[B]

rdma_cm QPs : OFF

Data ex. method : Ethernet


local address: LID 0000 QPN 0x0011 PSN 0xaaa186

GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:00:20

remote address: LID 0000 QPN 0x0106 PSN 0x7511e0

GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:00:19


#bytes #iterations BW peak[MB/sec] BW average[MB/sec] MsgRate[Mpps]

^C

root@train-gpu10:~#

Why can't they work together? Can you tell us which versions of SoftRoCE and hard RoCE can work together?
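For reference, one check we have not run yet (the sysfs paths and -x values below are assumptions based on the GID index lines in the outputs above): rxe speaks RoCE v2 only, so both sides should be using a RoCE v2 GID, and ib_send_bw can pin the GID index explicitly with -x. Running the test over rdma_cm with -R is another option, since rping (which uses rdma_cm) completed fine.

# Confirm the GID type behind each index used above (RoCE v1 vs v2):
cat /sys/class/infiniband/mlx5_0/ports/1/gid_attrs/types/3    # hard RoCE client, GID index 3
cat /sys/class/infiniband/rxe0/ports/1/gid_attrs/types/1      # SoftRoCE server, GID index 1

# Pin matching RoCE v2 GID indexes explicitly on both ends:
ib_send_bw -a -d rxe0 -x 1                                    # server
ib_send_bw -a -d mlx5_0 -x 3 192.168.0.20                     # client

# Or let rdma_cm resolve the GIDs, as rping does:
ib_send_bw -a -R -d rxe0                                      # server
ib_send_bw -a -R -d mlx5_0 192.168.0.20                       # client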

Hi,

I suggest reviewing the following guide:

https://community.mellanox.com/s/article/howto-configure-soft-roce
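For reference, the key step in that guide is creating an rxe device on top of the Ethernet interface that carries the RoCE traffic. The guide shows the older rxe_cfg script; with rdma-core 26.0 and a recent kernel/iproute2, the rdma tool does the same thing (the interface name below is only an example):

# Older method, as in the guide:
rxe_cfg start
rxe_cfg add eth0          # replace eth0 with the NIC that carries 192.168.0.20
rxe_cfg status

# Newer method using iproute2's rdma tool:
modprobe rdma_rxe
rdma link add rxe0 type rxe netdev eth0
rdma link show            # rxe0 should now appear, bound to eth0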

Thanks,

Samer

The guide on that page is quite old. We tried it a long time ago, and it didn't work at all.

Hi,

If you are using Mellanox adapters, there is no need to use Soft-RoCE; we fully support hard RoCE. You can refer to the MLNX_OFED 4.7 user manual:

https://docs.mellanox.com/pages/viewpage.action?pageId=19798092

Thanks,

Samer

I'm sorry, but I have to use SoftRoCE and hard RoCE together. Thank you for your help. By the way, I think Martijn van Breugel (Mellanox) may know the answer; can you forward this to him?

Hi,

Are you using Mellanox adapters on both sides?

In addition, you need to install the same OFED version, and in order to get Soft-RoCE capabilities, you need to install the kernel and user-space libraries on both servers.
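To make that concrete, the Soft-RoCE pieces to check on the server would roughly be the rdma_rxe kernel module plus the user-space rxe provider shipped with rdma-core (the package names below are Ubuntu examples, not taken from this thread):

modinfo rdma_rxe                                             # Soft-RoCE kernel module available?
lsmod | grep rdma_rxe                                        # loaded once the rxe device is created
dpkg -l | grep -E 'rdma-core|libibverbs|ibverbs-providers'   # user-space libraries and providers
ls /etc/libibverbs.d/                                        # driver files for the installed providers (e.g. rxe.driver)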

If you would like further assistance, you can open a support case at support@mellanox.com.

Thanks,

Samer

I use Mellanox adapters on both sides, but I only installed MLNX_OFED 4.5 on the hard RoCE side. On the SoftRoCE side, I did not install MLNX_OFED 4.5; this is also how it is illustrated in https://community.mellanox.com/s/article/howto-configure-soft-roce
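As a quick sanity check of that split setup, each side can list the RDMA devices its own stack exposes (device names taken from the outputs earlier in this thread):

# Hard RoCE side (MLNX_OFED):
ibv_devices                                        # should list mlx5_0
ibv_devinfo -d mlx5_0 | grep -E 'state|link_layer'

# SoftRoCE side (inbox kernel + rdma-core 26.0):
ibv_devices                                        # should list rxe0
ibv_devinfo -d rxe0 | grep -E 'state|link_layer'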

Hi ,

Please note that this scenario was never validated or tested, and therefore it is not supported.

We always recommend using the hard RoCE option in Mellanox adapters.

You can use the kernel inbox driver (not the MLNX_OFED driver) and still have hard RoCE available.
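If you go the inbox-driver route, a quick way to confirm the adapter's RDMA device is still exposed without MLNX_OFED is the sketch below (standard upstream module names; not verified against your kernel):

modprobe mlx5_ib                            # inbox RDMA driver for ConnectX-4/5
ls /sys/class/infiniband/                   # mlx5_0 should appear
ibv_devinfo -d mlx5_0 | grep link_layer     # should report Ethernet for RoCE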

Thanks

Samer

Hello, I would like to know how you installed Soft-RoCE. Why do I get an error when I run make install after compiling with bash build.sh?
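For context on that build error: as far as I can tell, rdma-core's build.sh only sets up an in-place development build under ./build (cmake, typically with ninja), so there is no Makefile in the source root for make install to use, and the in-place build is not meant to be installed. A typical from-source install configures cmake directly, roughly as below (the directory name and install prefix are illustrative):

cd rdma-core
mkdir build-rel && cd build-rel
cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/usr ..
make -j"$(nproc)"
sudo make install
# Alternatively, build your distribution's packages (e.g. dpkg-buildpackage
# on Ubuntu) so the libraries and provider files land in the standard paths.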