ethernet_read_keys: Couldn't read remote address

Background

Two hosts: host1 ib0 is 192.168.2.1 and host2 ib0 is 192.168.2.2.

Topology

host1 ib0 <=> MSX6018F-1SFS <=> host2 ib0

In host1:

ibv_devinfo
hca_id: mlx4_0
        transport:              InfiniBand (0)
        fw_ver:                 2.42.5000
        node_guid:              (hidden)
        sys_image_guid:         (hidden)
        vendor_id:              0x02c9
        vendor_part_id:         4099
        hw_ver:                 0x1
        board_id:               (hidden)
        phys_port_cnt:          1
        Device ports:
                port:   1
                        state:          PORT_ACTIVE (4)
                        max_mtu:        4096 (5)
                        active_mtu:     4096 (5)
                        sm_lid:         1
                        port_lid:       1
                        port_lmc:       0x00
                        link_layer:     InfiniBand

In host2:

ibv_devinfo
hca_id: mlx4_0
        transport:              InfiniBand (0)
        fw_ver:                 2.42.5000
        node_guid:              (hidden)
        sys_image_guid:         (hidden)
        vendor_id:              0x02c9
        vendor_part_id:         4099
        hw_ver:                 0x1
        board_id:               (hidden)
        phys_port_cnt:          1
        Device ports:
                port:   1
                        state:          PORT_ACTIVE (4)
                        max_mtu:        4096 (5)
                        active_mtu:     4096 (5)
                        sm_lid:         1
                        port_lid:       3
                        port_lmc:       0x00
                        link_layer:     InfiniBand
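For readers comparing the two ibv_devinfo outputs: they differ only in port_lid (1 on host1, 3 on host2), and both report sm_lid 1. The comparison can be sketched in Python as below. This is only an illustration, not a Mellanox tool. The helper names parse_devinfo and compare_ports are hypothetical, the sample text is a shortened copy of the output above, and the parser assumes one "key: value" pair per line and a single port (with several ports, later values would overwrite earlier ones).

```python
def parse_devinfo(text):
    """Collect 'key: value' pairs from ibv_devinfo-style text, one pair per line."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        # Skip lines without a colon and section headers such as "Device ports:".
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def compare_ports(a, b, keys=("fw_ver", "state", "active_mtu", "sm_lid", "link_layer")):
    """Return the keys whose values differ between two parsed outputs."""
    return [k for k in keys if a.get(k) != b.get(k)]


# Shortened sample based on the host1 output posted in this thread.
HOST1 = """\
hca_id: mlx4_0
    fw_ver: 2.42.5000
    Device ports:
        port: 1
            state: PORT_ACTIVE (4)
            active_mtu: 4096 (5)
            sm_lid: 1
            port_lid: 1
            link_layer: InfiniBand
"""
# host2 differs only in its port LID.
HOST2 = HOST1.replace("port_lid: 1", "port_lid: 3")

differences = compare_ports(parse_devinfo(HOST1), parse_devinfo(HOST2))
print(differences)
```

An empty list means the checked fields (firmware, port state, MTU, SM LID, link layer) match on both hosts; port_lid is left out of the default keys because it is expected to differ.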

In host1:

ib_write_bw -a

************************************
* Waiting for client to connect... *
************************************
Couldn't get device attributes
---------------------------------------------------------------------------------------
                    RDMA_Write BW Test
 Dual-port       : OFF          Device         : mlx4_0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 CQ Moderation   : 100
 Mtu             : 2048[B]
 Link type       : IB
 Max inline data : 0[B]
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0x01 QPN 0x0244 PSN 0x968878 RKey 0x60010100 VAddr 0x007f558e61d000
ethernet_read_keys: Couldn't read remote address
 Unable to read to socket/rdam_cm
Failed to exchange data between server and clients

In host2:

ib_write_bw -a 192.168.2.1
Couldn't get device attributes
Couldn't allocate MR
failed to create mr
Failed to create MR
Couldn't create IB resources

But if I run both the server and the client on the same host, everything works normally.

Why does this problem occur?

host1 and host2 can ping each other, so the network path between them is not blocked.
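A successful ping confirms ICMP reachability only. The "Data ex. method : Ethernet" line in the ib_write_bw output indicates that the test exchanges its connection parameters over an ordinary socket, so a TCP-level check may be a useful complement. A minimal sketch in Python follows. The helper name tcp_reachable is hypothetical, and the port 18515 in the usage comment is assumed to be the perftest default; the thread does not state which port is in use.

```python
import socket


def tcp_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be opened within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example use from host2 while "ib_write_bw -a" is waiting on host1
# (port 18515 is an assumption, see above):
#     tcp_reachable("192.168.2.1", 18515)
```

Note that a successful connection here consumes the server's single pending accept, so the ib_write_bw server would need to be restarted before the real test.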

Hi Hughen,

Can you please confirm whether the subnet prefix is the same for both IP addresses?

Thanks,

Namrata.
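The subnet question above can be checked with Python's standard ipaddress module. This is a sketch only: the helper name same_subnet is hypothetical, and the /24 prefix length is an assumption, since the thread gives the addresses but not the netmask configured on ib0.

```python
import ipaddress


def same_subnet(ip_a, ip_b, prefix_len):
    """Return True if both addresses fall in the same network for the given prefix length."""
    net_a = ipaddress.ip_interface(f"{ip_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{ip_b}/{prefix_len}").network
    return net_a == net_b


# Addresses from the original post; the prefix length 24 is assumed.
result = same_subnet("192.168.2.1", "192.168.2.2", 24)
print(result)
```

The actual prefix length on each host can be read from the output of "ip addr show ib0" and substituted for the assumed value.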

Hi Hughen,

A couple of other questions:

  1. Why does the output show some parameters as “(hidden)”? Was that done intentionally on your end?
  2. Are you using SR-IOV?
  3. It would be great if you could reinstall the driver (MLNX_OFED) and run the tests again.

Thanks,

Namrata.

Hi Hughen,

It would be great if you could open a support case with us at support@mellanox.com for debugging, as we would need more details and information than is practical to exchange via the community.

Thanks,

Namrata.

Hi Hughen,

It would be great if you could provide the following information:

  1. Where is the SM running?
  2. Sysinfo snapshots from both the hosts.

The sysinfo tool takes a snapshot of your server with all the relevant information about the Mellanox HCA.

To use the tool, please follow the instructions below:

  1. Download Sysinfo-Snapshot to the server from the link below by clicking “download” at the bottom left.

https://mellanox.my.salesforce.com/sfc/p/500000007heg/a/1T0000001i7B/YvSyVR5jFuDhD.UogwGkICUGfYVJjY2qn0MzpEWnEDs

  2. Untar the file by invoking: tar xvf sysinfo-snapshot-.tgz

  3. Run the script: ./sysinfo-snapshot.py [flags, options below]

3.1) -d | --dir sets the destination directory (default is /tmp).

3.2) -v | --version prints the tool’s version and exits.

3.3) -fw | --firmware adds firmware commands/functions to the output.

3.4) --no_ib does not run server InfiniBand commands.

3.5) --json adds a JSON file to the output.

3.6) -p | --perf includes more performance commands/functions, e.g. ib_write_bw and ib_write_lat.

3.7) --ibdiagnet adds the ibdiagnet command to the output.

3.8) --pcie adds PCIe commands/functions to the output.

3.9) --check_fw checks whether the current adapter firmware is the latest.

  4. You will get an output file named sysinfo-snapshot-v-HOSTNAME-DATE.tgz located under the /tmp directory, where HOSTNAME is the name of the host and DATE is the date in the format YYYYMMDD-HHMM. The output directory can be changed with the ‘-d’ parameter.

  5. Please send us the output file.
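To locate the resulting archive before sending it, a short Python sketch such as the following may help. It is not part of the sysinfo tool. The helper name find_snapshots is hypothetical, and the glob pattern is an assumption derived from the file name format described above.

```python
import glob
import os


def find_snapshots(directory="/tmp"):
    """Return paths of sysinfo-snapshot archives in `directory`, newest first."""
    # Pattern assumed from the naming scheme sysinfo-snapshot-v-HOSTNAME-DATE.tgz.
    pattern = os.path.join(directory, "sysinfo-snapshot-*.tgz")
    return sorted(glob.glob(pattern), key=os.path.getmtime, reverse=True)


# Example use (pass the directory given to -d if it was changed):
#     for path in find_snapshots("/tmp"):
#         print(path)
```

If the downloaded tool archive was saved in the same directory, it would match this pattern as well, so check the file name before sending.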