mlx5_0 port is Down!

Hi folks,

I don't know why my eth4 is down. Thanks.

$ ibdev2netdev

mlx5_0 port 1 ==> eth4 (Down)

mlx5_1 port 1 ==> eth5 (Up)
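With more ports, the down ones can be filtered out of this listing. This is a minimal sketch, assuming the `ibdev2netdev` line format shown above (`<device> port <n> ==> <netdev> (<state>)`); the function name `list_down_ports` is hypothetical.

```shell
#!/bin/sh
# list_down_ports: read ibdev2netdev-style lines on stdin and print the
# netdev name of every port whose state is "(Down)".
# Assumed line format: "<ibdev> port <n> ==> <netdev> (<state>)".
list_down_ports() {
    awk '$4 == "==>" && $6 == "(Down)" { print $5 }'
}

# Example using the two lines from the post. On a live system, pipe in
# the real output instead: ibdev2netdev | list_down_ports
printf '%s\n' \
    'mlx5_0 port 1 ==> eth4 (Down)' \
    'mlx5_1 port 1 ==> eth5 (Up)' | list_down_ports
```

For the two lines quoted in the post, this prints only `eth4`.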

$ ibv_devinfo

hca_id: mlx5_1

transport: InfiniBand (0)

fw_ver: 14.21.2010

node_guid: 248a:0703:00b7:5fe7

sys_image_guid: 248a:0703:00b7:5fe6

vendor_id: 0x02c9

vendor_part_id: 4117

hw_ver: 0x0

board_id: MT_2420110034

phys_port_cnt: 1

Device ports:

port: 1

state: PORT_ACTIVE (4)

max_mtu: 4096 (5)

active_mtu: 1024 (3)

sm_lid: 0

port_lid: 0

port_lmc: 0x00

link_layer: Ethernet

hca_id: mlx5_0

transport: InfiniBand (0)

fw_ver: 14.21.2010

node_guid: 248a:0703:00b7:5fe6

sys_image_guid: 248a:0703:00b7:5fe6

vendor_id: 0x02c9

vendor_part_id: 4117

hw_ver: 0x0

board_id: MT_2420110034

phys_port_cnt: 1

Device ports:

port: 1

state: PORT_DOWN (1)

max_mtu: 4096 (5)

active_mtu: 1024 (3)

sm_lid: 0

port_lid: 0

port_lmc: 0x00

link_layer: Ethernet

$ sudo mst status

MST modules:


MST PCI module is not loaded

MST PCI configuration module loaded

MST devices:


/dev/mst/mt4117_pciconf0 - PCI configuration cycles access.

domain:bus:dev.fn=0000:01:00.0 addr.reg=88 data.reg=92

Chip revision is: 00

/dev/mst/mt4117_pciconf1 - PCI configuration cycles access.

domain:bus:dev.fn=0000:81:00.0 addr.reg=88 data.reg=92

Chip revision is: 00

I am new to DPDK and Mellanox.

The original network configuration:

bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST> mtu 1500

inet 11.139.179.201 netmask 255.255.255.192 broadcast 11.139.179.255

ether 24:8a:07:b4:f7:2e txqueuelen 0 (Ethernet)

RX packets 1204082 bytes 819153403 (781.2 MiB)

RX errors 0 dropped 3979 overruns 0 frame 0

TX packets 1438017 bytes 531230030 (506.6 MiB)

TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

docker0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500

inet 192.168.5.1 netmask 255.255.255.0 broadcast 0.0.0.0

ether 02:42:c0:a8:05:01 txqueuelen 0 (Ethernet)

RX packets 0 bytes 0 (0.0 B)

RX errors 0 dropped 0 overruns 0 frame 0

TX packets 0 bytes 0 (0.0 B)

TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

eth4: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500

ether 24:8a:07:b4:f7:2e txqueuelen 1000 (Ethernet)

RX packets 734222 bytes 466093124 (444.5 MiB)

RX errors 0 dropped 0 overruns 0 frame 0

TX packets 970333 bytes 371897002 (354.6 MiB)

TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

eth5: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500

ether 24:8a:07:b4:f7:2e txqueuelen 1000 (Ethernet)

RX packets 469860 bytes 353060279 (336.7 MiB)

RX errors 0 dropped 1 overruns 0 frame 0

TX packets 467684 bytes 159333028 (151.9 MiB)

TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536

inet 127.0.0.1 netmask 255.0.0.0

loop txqueuelen 0 (Local Loopback)

RX packets 434513 bytes 112254013 (107.0 MiB)

RX errors 0 dropped 0 overruns 0 frame 0

TX packets 434513 bytes 112254013 (107.0 MiB)

TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

I want to use DPDK on a Mellanox NIC.

So I installed the OFED.

And then I ran l2fwd with the command “./build/l2fwd -c 0x1 -n 4 -w 01:00.1 -- -p 0x1”.

Everything ran successfully.

But I don't know when or why the eth4 port went down.

And what do you mean by “are they connected to other servers”?

What other information should I provide? Thanks so much!
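One way to narrow down when eth4 goes down is to record the kernel's view of each interface before and after every step (OFED install, l2fwd run). This is a minimal sketch that reads the standard `/sys/class/net/<iface>/operstate` file. The function name `netdev_state` and the `SYSFS_NET` override are hypothetical, and the interface names are the ones from this post.

```shell
#!/bin/sh
# netdev_state: print "<iface> operstate=<value>" for each interface given.
# SYSFS_NET is a hypothetical override used only to point the function at
# a different directory; by default the real /sys/class/net is read.
netdev_state() {
    root="${SYSFS_NET:-/sys/class/net}"
    for ifc in "$@"; do
        if [ -r "$root/$ifc/operstate" ]; then
            printf '%s operstate=%s\n' "$ifc" "$(cat "$root/$ifc/operstate")"
        else
            printf '%s operstate=unknown (no such interface)\n' "$ifc"
        fi
    done
}

# Interface names taken from the post; adjust them for your system.
netdev_state eth4 eth5
```

Running this once before starting l2fwd and once after it exits should show at which step the state changes. If the interface is only administratively down, `sudo ip link set eth4 up` may bring it back, but that will not help if there is no carrier on the cable.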

I ran a new test.

1. Configure the environment.

2. Run sudo hca_self_test.ofed:

$ sudo hca_self_test.ofed

---- Performing Adapter Device Self Test ----

Number of CAs Detected … 2

PCI Device Check … PASS

Kernel Arch … x86_64

Host Driver Version … mlnx-en-4.0-2.0.0.1 (OFED-4.0-2.0.0): 3.10.0-327.el7.x86_64

Host Driver RPM Check … PASS

Firmware on CA #0 NIC … v14.21.2010

Firmware on CA #1 NIC … v14.21.2010

Host Driver Initialization … PASS

Number of CA Ports Active … 2

Port State of Port #1 on CA #0 (NIC)… UP 1X EDR (Ethernet)

Port State of Port #1 on CA #1 (NIC)… UP 1X EDR (Ethernet)

Error Counter Check on CA #0 (NIC)… PASS

Error Counter Check on CA #1 (NIC)… PASS

Kernel Syslog Check … PASS

Node GUID on CA #0 (NIC) … 24:8a:07:03:00:b4:f7:2e

Node GUID on CA #1 (NIC) … 24:8a:07:03:00:b4:f7:2f

------------------ DONE ---------------------

3. Test DPDK with testpmd:

sudo ./testpmd -c 0xff00 -n 4 -w 0000:01:00.1 -- --rxq=2 --txq=2 -i

EAL: Detected 64 lcore(s)

EAL: No free hugepages reported in hugepages-1048576kB

EAL: Probing VFIO support…

EAL: PCI device 0000:01:00.1 on NUMA socket 0

EAL: probe driver: 15b3:1015 net_mlx5

PMD: net_mlx5: PCI information matches, using device “mlx5_1” (SR-IOV: false, MPS: true)

PMD: net_mlx5: 1 port(s) detected

PMD: net_mlx5: MPS is enabled

PMD: net_mlx5: port 1 MAC address is 24:8a:07:b4:f7:2e

Interactive-mode selected

USER1: create a new mbuf pool <mbuf_pool_socket_0>: n=203456, size=2176, socket=0

Configuring Port 0 (socket 0)

PMD: net_mlx5: 0xce4d80: TX queues number update: 0 → 2

PMD: net_mlx5: 0xce4d80: RX queues number update: 0 → 2

Port 0: 24:8A:07:B4:F7:2E

Checking link statuses…

Done

testpmd> start

io packet forwarding - ports=1 - cores=1 - streams=2 - NUMA support enabled, MP over anonymous pages disabled

Logical Core 9 (socket 0) forwards packets on 2 streams:

RX P=0/Q=0 (socket 0) → TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00

RX P=0/Q=1 (socket 0) → TX P=0/Q=1 (socket 0) peer=02:00:00:00:00:00

io packet forwarding - CRC stripping enabled - packets/burst=32

nb forwarding cores=1 - nb forwarding ports=1

RX queues=2 - RX desc=128 - RX free threshold=0

RX threshold registers: pthresh=0 hthresh=0 wthresh=0

TX queues=2 - TX desc=512 - TX free threshold=0

TX threshold registers: pthresh=0 hthresh=0 wthresh=0

TX RS bit threshold=0 - TXQ flags=0x0

testpmd> show port stats all

######################## NIC statistics for port 0 ########################

RX-packets: 256 RX-missed: 0 RX-bytes: 22364

RX-errors: 0

RX-nombuf: 0

TX-packets: 0 TX-errors: 0 TX-bytes: 0

Throughput (since last show)

Rx-pps: 0

Tx-pps: 0

############################################################################

testpmd> stop

Telling cores to stop…

Waiting for lcores to finish…

---------------------- Forward statistics for port 0 ----------------------

RX-packets: 0 RX-dropped: 0 RX-total: 0

TX-packets: 0 TX-dropped: 0 TX-total: 0


+++++++++++++++ Accumulated forward statistics for all ports+++++++++++++++

RX-packets: 0 RX-dropped: 0 RX-total: 0

TX-packets: 0 TX-dropped: 0 TX-total: 0

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Done.

It seems OK.

4. Run the self test again:

sudo hca_self_test.ofed

---- Performing Adapter Device Self Test ----

Number of CAs Detected … 2

PCI Device Check … PASS

Kernel Arch … x86_64

Host Driver Version … mlnx-en-4.0-2.0.0.1 (OFED-4.0-2.0.0): 3.10.0-327.el7.x86_64

Host Driver RPM Check … PASS

Firmware on CA #0 NIC … v14.21.2010

Firmware on CA #1 NIC … v14.21.2010

Host Driver Initialization … PASS

Number of CA Ports Active … 1

Port State of Port #1 on CA #0 (NIC)… UP 1X EDR (Ethernet)

Port State of Port #1 on CA #1 (NIC)… DOWN (Ethernet)

Error Counter Check on CA #0 (NIC)… PASS

Error Counter Check on CA #1 (NIC)… PASS

Kernel Syslog Check … PASS

Node GUID on CA #0 (NIC) … 24:8a:07:03:00:b4:f7:2e

Node GUID on CA #1 (NIC) … 24:8a:07:03:00:b4:f7:2f

------------------ DONE ---------------------

We can see that the second run reports "Port State of Port #1 on CA #1 (NIC)… DOWN (Ethernet)".

I guess there may be a problem with the parameter -w 0000:01:00.1?
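To check which kernel interface the `-w` address actually refers to, the PCI address can be looked up in sysfs, where a PCI network device exposes its netdev name under `/sys/bus/pci/devices/<address>/net/`. This is a minimal sketch. The function name `pci_to_netdev` and the `SYSFS_PCI` override are hypothetical.

```shell
#!/bin/sh
# pci_to_netdev: print the kernel netdev name(s) behind a PCI address.
# SYSFS_PCI is a hypothetical override used only to point the function at
# a different directory; by default the real sysfs tree is read.
pci_to_netdev() {
    addr="$1"
    root="${SYSFS_PCI:-/sys/bus/pci/devices}"
    # DPDK accepts the short form "01:00.1"; sysfs entries also carry
    # the PCI domain, so prepend "0000:" when it is missing.
    case "$addr" in
        *:*:*) ;;
        *) addr="0000:$addr" ;;
    esac
    if [ -d "$root/$addr/net" ]; then
        ls "$root/$addr/net"
    else
        echo "no kernel netdev found for $addr" >&2
        return 1
    fi
}

# Address taken from the testpmd command line in the post.
pci_to_netdev 0000:01:00.1 || true
```

In the testpmd log above, 0000:01:00.1 is reported as "mlx5_1", which the earlier `ibdev2netdev` output maps to eth5, assuming that mapping has not changed between the two runs. The lookup confirms whether the whitelisted device is the interface you expect.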

Thanks!

Do you have a cable in both ports and are they connected to other servers?

Hi kernel Peng,

Thank you for posting your question on the Mellanox Community.

Based on the information provided, please follow the instructions from the following links for successful installation of DPDK in combination with MLNX_OFED

MLX5 poll mode driver → 36. MLX5 Ethernet Poll Mode Driver (Data Plane Development Kit 22.07.0 documentation)

** Make sure you install MLNX_OFED with the options '--dpdk --upstream-libs'

** Make sure that all ports are set to Eth

** Verify that the firmware is updated to the supported version mentioned in the MLNX_OFED Release Notes → http://www.mellanox.com/related-docs/prod_software/Mellanox_OFED_Linux_Release_Notes_4_3-1_0_1_0.pdf

After successful installation of MLNX_OFED, continue with the installation of the latest DPDK version → https://fast.dpdk.org/rel/dpdk-18.02.1.tar.xz

Please follow the getting started instructions from the following link → Getting Started Guide for Linux (Data Plane Development Kit 22.07.0 documentation)

If you still encounter any issues after these steps, we suggest opening a support case through Mellanox Support so we can assist you accordingly.

You can open a new support ticket by sending an email to support@mellanox.com

Thanks and regards,

~Mellanox Technical Support

Hello -

If your port is down, it means you do not have a cable plugged in.

thanks -steve


But the eth4 port was up before I installed OFED and ran l2fwd.

eth4 and eth5 are bonded into bond0.
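Since both interfaces are slaves of bond0, the bonding driver's own view of each slave can also be checked. This is a minimal sketch that parses the `Slave Interface:` and `MII Status:` lines of `/proc/net/bonding/bond0`. The function name `bond_slave_status` is hypothetical, and the sample text in the comment is illustrative rather than taken from this system.

```shell
#!/bin/sh
# bond_slave_status: read /proc/net/bonding/<bond>-style text on stdin and
# print "<slave> <MII status>" for each slave interface. The bond's own
# "MII Status:" line (which comes before any slave) is skipped.
bond_slave_status() {
    awk -F': ' '
        /^Slave Interface:/            { slave = $2 }
        /^MII Status:/ && slave != ""  { print slave, $2; slave = "" }
    '
}

# On a host with the bonding driver loaded and a bond named bond0:
if [ -r /proc/net/bonding/bond0 ]; then
    bond_slave_status < /proc/net/bonding/bond0
fi
# Illustrative input:              Resulting output:
#   Slave Interface: eth4            eth4 down
#   MII Status: down
```

If the bond reports a slave as down while the other stays up, traffic on bond0 may keep flowing over the remaining slave, which could explain why the problem is easy to miss.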

I would simplify your setup first.

  • remove the bonding

  • check that your OFED and FW versions are the latest

  • connect two servers back to back

  • set up the ports identically, except for their IP addresses

  • test (ping, iperf, etc.)

  • then move on to a more sophisticated setup
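The local side of such a back-to-back test could look roughly like the following. This is only a sketch: the addresses, the `/24` prefix, and the function name `b2b_commands` are hypothetical placeholders, and the function prints the commands for review instead of executing them.

```shell
#!/bin/sh
# Hypothetical addresses for a back-to-back link; replace with your own.
LOCAL_IP="${LOCAL_IP:-192.168.100.1}"
PEER_IP="${PEER_IP:-192.168.100.2}"
IFACE="${IFACE:-eth4}"

# b2b_commands: print the commands for the local side of the test so they
# can be reviewed and then run by hand. Nothing is executed here.
b2b_commands() {
    echo "sudo ip addr add $LOCAL_IP/24 dev $IFACE"
    echo "sudo ip link set $IFACE up"
    echo "ping -c 3 $PEER_IP"
    echo "iperf -c $PEER_IP"
}

b2b_commands
```

The peer server would need the mirrored address configured and `iperf -s` running before the `iperf -c` step. Removing the bond first is distribution-specific and is not covered by this sketch.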

Hi shanshanpt,

I think you have a mismatch between the DPDK mlx5 PMD and MLNX_OFED (which is not loaded).

The Host Driver Version should show MLNX_OFED, not mlnx-en-4.0-2.0.0.1.

Please check whether the MLNX_OFED drivers are loaded and retry.
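A quick way to check this is to look at the driver version string. This is a minimal sketch that classifies a string such as the "Host Driver Version" field printed by `hca_self_test.ofed`. The function name `classify_driver` is hypothetical, and feeding it the output of `ofed_info -s` assumes that tool is installed and prints the package name.

```shell
#!/bin/sh
# classify_driver: given a driver version string, report whether it names
# the full MLNX_OFED package or the Ethernet-only mlnx-en package.
classify_driver() {
    case "$1" in
        *MLNX_OFED*) echo "MLNX_OFED" ;;
        *mlnx-en*)   echo "mlnx-en" ;;
        *)           echo "unknown" ;;
    esac
}

# String copied from the hca_self_test.ofed output in this thread.
classify_driver "mlnx-en-4.0-2.0.0.1 (OFED-4.0-2.0.0): 3.10.0-327.el7.x86_64"

# On a live system (assumption: ofed_info is installed):
#   classify_driver "$(ofed_info -s)"
#   lsmod | grep mlx5_core    # confirms the mlx5 kernel module is loaded
```

For the string from this thread the function prints `mlnx-en`, which matches the mismatch described above.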


I recommend upgrading to the latest driver. I think the NIC driver is not correctly loaded into the system kernel.