RoCEv2 GIDs disappeared?

Hi everybody!

How are ConnectX-5 GIDs created/initialized? They disappeared after an OFED upgrade.

Neither ibv_devinfo -v nor show_gids displays any GID.

Any ideas?

Regards, Raph

After the upgrade to MLNX_OFED_LINUX-4.4-1.0.0.0-debian9.1-x86_64:

scisoft13:~ % sudo show_gids

DEV PORT INDEX GID IPv4 VER DEV


n_gids_found=0

Before the upgrade:

scisoft13:~ % sudo show_gids

DEV     PORT  INDEX  GID                                      IPv4          VER  DEV
mlx4_0  1     0      fe80:0000:0000:0000:526b:4bff:fe4f:be21                v1   enp131s0
mlx4_0  2     0      fe80:0000:0000:0000:526b:4bff:fe4f:be22                v1   enp131s0d1
mlx5_0  1     0      fe80:0000:0000:0000:526b:4bff:fed3:d164                v1   enp130s0f0
mlx5_0  1     1      fe80:0000:0000:0000:526b:4bff:fed3:d164                v2   enp130s0f0
mlx5_0  1     2      0000:0000:0000:0000:0000:ffff:c0a8:030d  192.168.3.13  v1   enp130s0f0
mlx5_0  1     3      0000:0000:0000:0000:0000:ffff:c0a8:030d  192.168.3.13  v2   enp130s0f0
mlx5_1  1     0      fe80:0000:0000:0000:526b:4bff:fed3:d165                v1   enp130s0f1
mlx5_1  1     1      fe80:0000:0000:0000:526b:4bff:fed3:d165                v2   enp130s0f1

n_gids_found=8
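
For readers wondering where these GID values come from: the entries in the pre-upgrade table appear to follow the usual RoCE pattern, where each GID is derived from an address on the associated net device. The fe80:… entries look like link-local IPv6 addresses built from the port MAC (modified EUI-64), and the …:ffff:c0a8:030d entries are the IPv4-mapped form of 192.168.3.13. The sketch below only decodes GIDs in those two forms; the function names are made up for illustration and are not part of OFED.

```python
import ipaddress


def gid_to_ipv4(gid):
    """Return the IPv4 address embedded in an IPv4-mapped GID, or None."""
    addr = ipaddress.IPv6Address(gid)
    # ipv4_mapped is None unless the address has the form ::ffff:a.b.c.d
    return addr.ipv4_mapped


def gid_to_mac(gid):
    """Recover the MAC from a link-local GID built with modified EUI-64, or None."""
    addr = ipaddress.IPv6Address(gid)
    iid = addr.packed[8:]  # low 64 bits: the interface identifier
    if not addr.is_link_local or iid[3:5] != b"\xff\xfe":
        return None
    # Undo EUI-64: drop the ff:fe filler and flip the universal/local bit.
    mac = bytes([iid[0] ^ 0x02]) + iid[1:3] + iid[5:8]
    return ":".join(f"{b:02x}" for b in mac)


if __name__ == "__main__":
    # Values taken from the show_gids output above.
    print(gid_to_ipv4("0000:0000:0000:0000:0000:ffff:c0a8:030d"))  # 192.168.3.13
    print(gid_to_mac("fe80:0000:0000:0000:526b:4bff:fed3:d164"))   # 50:6b:4b:d3:d1:64
```

If the GIDs really are derived this way, an empty table after the upgrade would suggest that the driver is no longer picking up the net device addresses, rather than that the addresses themselves are gone (ping still works).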

ibv_devinfo -v output:

hca_id: mlx5_1

transport: InfiniBand (0)

fw_ver: 16.23.1000

node_guid: 506b:4b03:00d3:d185

sys_image_guid: 506b:4b03:00d3:d184

vendor_id: 0x02c9

vendor_part_id: 4119

hw_ver: 0x0

board_id: MT_0000000012

phys_port_cnt: 1

max_mr_size: 0xffffffffffffffff

page_size_cap: 0xfffffffffffff000

max_qp: 262144

max_qp_wr: 32768

device_cap_flags: 0xe5721c36

BAD_PKEY_CNTR

BAD_QKEY_CNTR

AUTO_PATH_MIG

CHANGE_PHY_PORT

PORT_ACTIVE_EVENT

SYS_IMAGE_GUID

RC_RNR_NAK_GEN

XRC

Unknown flags: 0xe5620000


device_cap_exp_flags: 0x520DF8F100000000

EXP_DC_TRANSPORT

EXP_CROSS_CHANNEL

EXP_MR_ALLOCATE

EXT_ATOMICS

EXT_SEND NOP

EXP_UMR

EXP_ODP

EXP_RX_CSUM_TCP_UDP_PKT

EXP_RX_CSUM_IP_PKT

EXP_MASKED_ATOMICS

EXP_RX_TCP_UDP_PKT_TYPE

EXP_SCATTER_FCS

EXP_WQ_DELAY_DROP

EXP_PHYSICAL_RANGE_MR

EXP_UMR_FIXED_SIZE

Unknown flags: 0x200000000000

max_sge: 30

max_sge_rd: 30

max_cq: 16777216

max_cqe: 4194303

max_mr: 16777216

max_pd: 16777216

max_qp_rd_atom: 16

max_ee_rd_atom: 0

max_res_rd_atom: 4194304

max_qp_init_rd_atom: 16

max_ee_init_rd_atom: 0

atomic_cap: ATOMIC_HCA (1)

log atomic arg sizes (mask) 0x8


masked_log_atomic_arg_sizes (mask) 0x3c

masked_log_atomic_arg_sizes_network_endianness (mask) 0x34

max fetch and add bit boundary 64

log max atomic inline 5

max_ee: 0

max_rdd: 0

max_mw: 16777216

max_raw_ipv6_qp: 0

max_raw_ethy_qp: 0

max_mcast_grp: 2097152

max_mcast_qp_attach: 240

max_total_mcast_qp_attach: 503316480

max_ah: 2147483647

max_fmr: 0

max_srq: 8388608

max_srq_wr: 32767

max_srq_sge: 31

max_pkeys: 128

local_ca_ack_delay: 16

hca_core_clock: 78125

max_klm_list_size: 65536

max_send_wqe_inline_klms: 20

max_umr_recursion_depth: 4

max_umr_stride_dimension: 1

general_odp_caps:

ODP_SUPPORT

ODP_SUPPORT_IMPLICIT

max_size: 0xFFFFFFFFFFFFFFFF

rc_odp_caps:

SUPPORT_SEND

SUPPORT_RECV

SUPPORT_WRITE

SUPPORT_READ

uc_odp_caps:

NO SUPPORT

ud_odp_caps:

SUPPORT_SEND

dc_odp_caps:

SUPPORT_SEND

SUPPORT_WRITE

SUPPORT_READ

xrc_odp_caps:

NO SUPPORT

raw_eth_odp_caps:

NO SUPPORT

max_dct: 262144

max_device_ctx: 1020

Multi-Packet RQ supported

Supported for objects type:

IBV_EXP_MP_RQ_SUP_TYPE_SRQ_TM

IBV_EXP_MP_RQ_SUP_TYPE_WQ_RQ


Supported payload shifts:

2 bytes

Log number of strides for single WQE: 3 - 16

Log number of bytes in single stride: 6 - 13

VLAN offloads caps:

C-VLAN stripping offload

C-VLAN insertion offload

rx_pad_end_addr_align: 64

tso_caps:

max_tso: 262144

supported_qp:

SUPPORT_RAW_PACKET

packet_pacing_caps:

qp_rate_limit_min: 0kbps

qp_rate_limit_max: 0kbps

ooo_caps:

ooo_rc_caps = 0x1

ooo_xrc_caps = 0x1

ooo_dc_caps = 0x1

ooo_ud_caps = 0x0

SUPPORT_RC_RW_DATA_PLACEMENT

SUPPORT_XRC_RW_DATA_PLACEMENT

SUPPORT_DC_RW_DATA_PLACEMENT

sw_parsing_caps:

SW_PARSING

SW_PARSING_CSUM

SW_PARSING_LSO

supported_qp:

SUPPORT_RAW_PACKET

tag matching not supported

tunnel_offloads_caps:

TUNNEL_OFFLOADS_VXLAN

TUNNEL_OFFLOADS_GRE

TUNNEL_OFFLOADS_GENEVE

UMR fixed size:

max entity size: 2147483648

Device ports:

port: 1

state: PORT_ACTIVE (4)

max_mtu: 4096 (5)

active_mtu: 1024 (3)

sm_lid: 0

port_lid: 0

port_lmc: 0x00

link_layer: Ethernet


max_msg_sz: 0x40000000

port_cap_flags: 0x04010000

max_vl_num: invalid value (0)

bad_pkey_cntr: 0x0

qkey_viol_cntr: 0x0

sm_sl: 0

pkey_tbl_len: 1

gid_tbl_len: 256

subnet_timeout: 0

init_type_reply: 0

active_width: 4X (2)

active_speed: 25.0 Gbps (32)

phys_state: LINK_UP (5)


Hi Raphael.

What is the output of “show_gids” after the driver installation?

Thank you,

Karen.

Thank you for helping! It is strange, because everything worked before the upgrade.

  • ping works

  • /sys/class/infiniband/ exists and is populated, but the GIDs are missing

After the upgrade to MLNX_OFED_LINUX-4.4-1.0.0.0-debian9.1-x86_64:

scisoft13:~ % sudo show_gids

DEV PORT INDEX GID IPv4 VER DEV


scisoft13:~ % cat /sys/class/infiniband/mlx5_0/ports/1/gids/0

0000:0000:0000:0000:0000:0000:0000:0000

scisoft13:~ % cat /sys/class/infiniband/mlx5_0/ports/1/gids/1

0000:0000:0000:0000:0000:0000:0000:0000

scisoft13:~ % cat /sys/class/infiniband/mlx5_0/ports/1/gid_attrs/types/0

cat: /sys/class/infiniband/mlx5_0/ports/1/gid_attrs/types/0: Invalid argument

The same issue occurs on two servers, each with a ConnectX-5 EN 100 Gb/s optical link and a ConnectX-3 40 Gb/s copper link.

The OFED installation completed without any issue.

Hi Raphael,

Thank you for the information. This looks like unexpected behaviour related to the driver on this specific operating system.

So that we can continue investigating, please send an email to support@mellanox.com and open a support ticket with all the details.

Thank you,

Karen.