I am trying to install and test two servers, each with a ConnectX-5 card. The two servers were already connected on a local network. The two ConnectX-5 cards were added and connected in host-chaining fashion. The ib_send_bw test succeeds using the servers' original IPs. However, rping and other tests fail with “RDMA_CM_EVENT_ADDR_ERROR, error -19”.
I am not a networking expert. From reading some guides, I gather that I may need to assign an IP to the ConnectX-5 cards, but I cannot find out exactly how to do it.
So, what should I do? Thanks.
Here are a few points to check:
Did you run ib_write_bw with -R (rdma_cm) too?
Can you ping between all the machines in your host-chaining setup?
Do you use a unidirectional or a bidirectional topology?
Check your firmware settings again:
mlxconfig -d /dev/mst/mt4119_pciconf0 q | grep HOST_CHAINING_MODE
If it is not set correctly, please apply the following on all the ports:
mlxconfig -d /dev/mst/mt4119_pciconf0 set HOST_CHAINING_MODE=1
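The query and the conditional set can be combined in a small script. This is only a sketch: it assumes the device path shown in the commands here and assumes the query output contains a line such as `HOST_CHAINING_MODE BASIC(1)` when host chaining is on. The function names are made up for illustration.

```shell
#!/bin/sh
# Device path taken from the commands in this thread; adjust it for your system.
DEV=${DEV:-/dev/mst/mt4119_pciconf0}

# Succeeds if the text on stdin reports host chaining as BASIC(1).
# (Hypothetical helper; the expected line format is an assumption.)
host_chaining_enabled() {
    grep -Eq 'HOST_CHAINING_MODE[[:space:]]+BASIC\(1\)'
}

# Query the firmware and set host chaining only if it is not already enabled.
ensure_host_chaining() {
    if mlxconfig -d "$DEV" q | host_chaining_enabled; then
        echo "host chaining already enabled on $DEV"
    else
        mlxconfig -d "$DEV" set HOST_CHAINING_MODE=1
        echo "reboot or reset the firmware for the change to take effect"
    fi
}
```

Run `ensure_host_chaining` as root on each server after the MST devices have been created with `mst start`. A changed firmware setting only becomes active after a reboot or firmware reset.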
Thanks for your reply.
- Adding -R causes ib_write_bw to fail with the error message “Unable to perform rdma_client function, Unable to init the socket connection”.
- I can ping between the servers using their original local IPs, but ping fails when I use the IPs I assigned to the mlx5 cards.
- I have two servers. Two cables connect the two ports of one card to the two ports of the other, so I guess it is bidirectional.
- I checked HOST_CHAINING_MODE on both servers; both show ‘HOST_CHAINING_MODE BASIC(1)’.
Another problem is that I assigned a dedicated IP to each card with
“ifconfig ens16f0 126.96.36.199/24 up” and “ifconfig ens16f0 188.8.131.52/24 up”, but I cannot ping or run ib_write_bw between them using “184.108.40.206” and “220.127.116.11”.
How do I correctly assign a dedicated IP to the mlx5 cards?
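For reference, one common way to give directly connected cards their own addresses uses `ip` from iproute2. The sketch below is an assumption-laden example, not a confirmed fix for this setup: the interface name `ens16f0` comes from the post, the addresses in the usage note are hypothetical placeholders, and the helper names are made up. With a /24 mask both cards must sit in the same /24, and that subnet should differ from the one used by the existing local network.

```shell
#!/bin/sh
# Interface name taken from the post; adjust it for your system.
IFACE=${IFACE:-ens16f0}

# Succeeds if two dotted-quad addresses share the same first three octets,
# i.e. they fall in the same /24 subnet. (Hypothetical helper.)
same_subnet24() {
    [ "${1%.*}" = "${2%.*}" ]
}

# Assign ADDR/24 to the interface and bring the link up (run as root).
assign_ip() {
    ip addr add "$1/24" dev "$IFACE"
    ip link set "$IFACE" up
}
```

For example, with placeholder addresses, run `assign_ip 192.168.100.1` on the first server and `assign_ip 192.168.100.2` on the second, then test with `ping -I ens16f0 192.168.100.2` from the first. `same_subnet24 192.168.100.1 192.168.100.2` can be used beforehand to confirm that the two addresses are in the same /24.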
I have attached a diagram of a bidirectional host-chaining ring.
For more information and debugging, please open a case at firstname.lastname@example.org