We have a Xavier on a custom carrier board that is very similar to the dev kit. We’re using JetPack 4.2.2.
When there is heavy inbound scp traffic into the Xavier on the wired ethernet port (using the eqos driver), the port will often hang. In this state the rx packet count stops incrementing. A reboot fixes it, obviously, but so does taking the interface down and back up with ifconfig, or restarting the network manager. This is very easy to reproduce: copy a large file to the Xavier over the wired link and it will fail within 30 seconds.
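For anyone trying to reproduce this, here is a rough sketch of how the stalled rx counter could be caught automatically while the scp is running. It polls the standard Linux sysfs counter at `/sys/class/net/<iface>/statistics/rx_packets`. The interface name `eth0`, the one-second interval, and the "three flat samples" threshold are assumptions for illustration, not values from our setup.

```python
import time
from pathlib import Path


def read_rx_packets(iface, sysfs_root="/sys/class/net"):
    """Return the kernel's cumulative RX packet count for an interface."""
    path = Path(sysfs_root, iface, "statistics", "rx_packets")
    return int(path.read_text())


def first_stall(samples, flat_needed=3):
    """Return the index at which the counter has stayed unchanged for
    `flat_needed` consecutive intervals, or None if it never does."""
    flat = 0
    for i in range(1, len(samples)):
        flat = flat + 1 if samples[i] == samples[i - 1] else 0
        if flat >= flat_needed:
            return i
    return None


def watch(iface="eth0", interval=1.0, flat_needed=3):
    """Poll the RX counter until it looks stalled, then report and return.

    "eth0" is an assumed interface name; substitute the eqos port's name.
    """
    samples = []
    while True:
        samples.append(read_rx_packets(iface))
        # Only the most recent flat_needed + 1 samples matter for the check.
        if first_stall(samples[-(flat_needed + 1):], flat_needed) is not None:
            print(f"{iface}: rx_packets stuck at {samples[-1]}")
            return samples[-1]
        time.sleep(interval)
```

Calling `watch("eth0")` on the Xavier (over the wireless link) during the copy should return shortly after the hang. A flat counter only means something while a transfer is in progress, since an idle link looks the same.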
The problem only happens with heavy inbound traffic. Outbound traffic isn’t a problem.
The wireless card still works, and I can get into the Xavier that way even when the eqos port is hung, so the network stack isn’t completely borked.
There are no obvious kernel errors in the logs. I also tried the JetPack 4.4 eqos driver on the Xavier (the 4.4 driver source built in the 4.2.2 tree), and it behaved exactly the same.
Curiously, this problem doesn’t happen when pushing data to the Xavier using other networking tools like netcat (TCP) or iperf. It’s only a problem for scp inbound traffic. Wireshark shows what you would expect - the Xavier behaves like it’s not receiving any more packets on that port.
I’ve also tried disabling things like scatter-gather, TSO, etc. on the Xavier with no improvement.
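For reference, the offload toggling can be scripted roughly as below. This is a sketch that wraps `ethtool -K`. The feature list (`sg`, `tso`, `gso`, `gro`) and the interface name `eth0` are assumptions: only scatter-gather and TSO are named above, and the eqos driver may refuse to change some of these.

```python
import subprocess

# Assumed set of offloads to toggle; only sg and tso are named in the post.
OFFLOADS = ("sg", "tso", "gso", "gro")


def offload_command(iface, features=OFFLOADS, state="off"):
    """Build the `ethtool -K` argument list that sets each feature to state."""
    if state not in ("on", "off"):
        raise ValueError("state must be 'on' or 'off'")
    cmd = ["ethtool", "-K", iface]
    for feature in features:
        cmd += [feature, state]
    return cmd


def apply_offloads(iface, features=OFFLOADS, state="off"):
    """Run ethtool (needs root); raises CalledProcessError if it fails."""
    subprocess.run(offload_command(iface, features, state), check=True)
```

For example, `apply_offloads("eth0")` runs `ethtool -K eth0 sg off tso off gso off gro off`, and `ethtool -k eth0` afterwards shows which settings actually took effect.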
Any bright ideas would be appreciated.
Thanks
Jim