As the title says, when I run sudo iblinkinfo on the host side of server A, it reports the following error:
ibwarn: [9475] _do_madrpc: send failed; Invalid argument
ibwarn: [9475] mad_rpc: _do_madrpc failed; dport (DR path slid 0; dlid 0; 0)
/var/tmp/rdma-core/rdma-core-54mlnx1/libibnetdisc/ibnetdisc.c:811; Failed to resolve self
discover failed
However, the same command succeeds when I run it on the DPU (BlueField-2) side of server A.
RDMA traffic shows the same pattern: requests sent from another server, B, to server A fail when they target the host side and succeed when they target the DPU side.
Could someone please suggest how to solve this problem? Thanks a lot.
P.S.
- The link state is normal, as ibstatus works fine.
- The OFED version on both the host and DPU sides is MLNX_OFED_LINUX-5.4-1.0.3.0.
- Both the host and DPU sides run Ubuntu.