NETDEV WATCHDOG: eth0(mlx5_core): transmit queue timed out

Hi all, we run openEuler 20.03 SP3 and CentOS 7 with mlx5_core driver version 5.0-0. From time to time "mlx5_core transmit queue timed out" shows up in /var/log/messages. We upgraded the mlx5_core driver to 5.8-2.0.3, but the timeout problem is still there.
Is there any way to resolve this? Thank you.

What HCA model is it? Are you using IPoIB?

You can update to the latest OFED and firmware.
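
Before updating, it may help to record what is currently installed. A minimal sketch, assuming the interface is eth0 as in the thread title; the helper name driver_field is made up for illustration:

```shell
#!/bin/sh
# Sketch only: IFACE and the helper name are illustrative assumptions.
IFACE="${IFACE:-eth0}"

# Print the value of one "key: value" field from `ethtool -i` output on stdin.
driver_field() {
    awk -F': ' -v key="$1" '$1 == key { print $2 }'
}

# Report driver name, driver version and firmware version for the interface.
show_driver_info() {
    info=$(ethtool -i "$IFACE")
    for key in driver version firmware-version; do
        printf '%s: %s\n' "$key" "$(printf '%s\n' "$info" | driver_field "$key")"
    done
}
```

Running show_driver_info before and after the update shows whether the new driver and firmware are actually in use. The HCA model itself can usually be read from "lspci | grep -i mellanox".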

We do not use IPoIB. When the server uses IP-in-IP, we discovered another problem: the client cannot connect to the server, and the client’s TcpInCsumErrors counter increases in the output of “nstat -az”. After the server runs “ethtool -K eth0 tx off”, the client can connect to the server.
The latest EN and OFED drivers both have this problem.
Is it a bug? Thank you.

That is TX checksum offload; it should be off in your case.
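
For reference, a rough sketch of how the setting could be checked and changed from the shell. The interface name eth0 comes from the thread; the helper function names are hypothetical. Note that lowercase "ethtool -k" only displays the offload settings, while uppercase "ethtool -K" changes them.

```shell
#!/bin/sh
# Sketch only: IFACE and the helper names are illustrative assumptions.
IFACE="${IFACE:-eth0}"

# Print the value column for one counter from `nstat -az` output on stdin.
counter_value() {
    awk -v name="$1" '$1 == name { print $2 }'
}

# Show the checksum offload state; lowercase -k only reads the settings.
show_csum_offload() {
    ethtool -k "$IFACE" | grep -i checksum
}

# Turn TX checksum offload off; uppercase -K changes settings (needs root).
disable_tx_csum() {
    ethtool -K "$IFACE" tx off
}
```

On the server, run show_csum_offload, then disable_tx_csum. On the client, compare "nstat -az | counter_value TcpInCsumErrors" before and after a connection attempt; if the counter stops growing, the bad checksums were coming from the offload path. The ethtool change does not survive a reboot unless it is also added to the interface configuration.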

With the kernel's default driver (5.0-0) we do not need to turn it off, but after updating to the new version we do. Why is that? Will it be fixed in the next version?

