Packet drop on 10GbE interface

Please provide the following info:
Software Version
DRIVE OS Linux 5.2.0
DRIVE OS Linux 5.2.0 and DriveWorks 3.5
NVIDIA DRIVE™ Software 10.0 (Linux)
NVIDIA DRIVE™ Software 9.0 (Linux)
other DRIVE OS version
other

Target Operating System
Linux
QNX
other

Hardware Platform
NVIDIA DRIVE™ AGX Xavier DevKit (E3550)
NVIDIA DRIVE™ AGX Pegasus DevKit (E3550)
other

SDK Manager Version
1.5.0.7774
other

Host Machine Version
native Ubuntu 18.04
other

Hello,

I am trying to use 10GbE to transfer data between Xavier-a and Xavier-b. I am seeing serious packet drops on this interface, even when simply pinging Xavier-b from Xavier-a. Some facts about the current state:

  • This interface uses the atlantic driver.
  • The packet drops are visible in the output of this command: ip -s link show enp4s0
  • The drops happen on RX, not on TX.
  • In the output of ethtool -S enp4s0, the drops are counted under InDroppedDma.
  • I increased the DMA buffer from its default value (2048) to its maximum value (8184), but it didn’t help.
  • This issue also occurred on DRIVE OS 5.1.6 and still occurs on 5.2.
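
For anyone who wants to track the ethtool -S counters over several runs rather than read them by eye, the output can be turned into numbers. This is a minimal Python sketch and not part of the original report: parse_ethtool_stats is a hypothetical helper name, and the sample text is a few lines copied from the Xavier-B capture posted further down in this thread.

```python
def parse_ethtool_stats(text):
    """Parse the text printed by `ethtool -S <iface>` into {counter_name: int}."""
    stats = {}
    for line in text.splitlines():
        # Counter lines look like "     InDroppedDma: 54". Split on the last
        # colon so names such as "Queue[0] InPackets" stay intact.
        name, sep, value = line.strip().rpartition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


# A few lines copied from the Xavier-B capture in this thread.
sample = """NIC statistics:
     InPackets: 54
     InPacketsDma: 6
     InDroppedDma: 54
     Queue[0] InPackets: 2
"""

stats = parse_ethtool_stats(sample)
print("InDroppedDma =", stats["InDroppedDma"])
```

On a target, the same function could be fed the captured output of ethtool -S enp4s0 (for example via subprocess.run with capture_output=True); that part is left out here because it depends on the local setup.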

Do you have any idea what could be causing this serious packet drop?

Dear @ebrahim4o8qh,
Could you share the steps to reproduce, together with output logs, on DRIVE OS 5.2.0?

Sure. From Xavier-B:

~$ ping 192.168.0.10
PING 192.168.0.10 (192.168.0.10) 56(84) bytes of data.
64 bytes from 192.168.0.10: icmp_seq=1 ttl=64 time=0.544 ms
64 bytes from 192.168.0.10: icmp_seq=2 ttl=64 time=0.200 ms
64 bytes from 192.168.0.10: icmp_seq=3 ttl=64 time=0.179 ms
64 bytes from 192.168.0.10: icmp_seq=4 ttl=64 time=0.182 ms
^C
--- 192.168.0.10 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3076ms

~$ ip -s link show  enp3s0
4: enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:04:4b:f6:63:25 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast   
    512        6        0       54      0       48      
    TX: bytes  packets  errors  dropped carrier collsns 
    3932       54       0       0       0       0

~$ ethtool -S enp3s0
NIC statistics:
     InPackets: 54
     InUCast: 6
     InMCast: 48
     InBCast: 0
     InErrors: 0
     OutPackets: 54
     OutUCast: 5
     OutMCast: 48
     OutBCast: 1
     InUCastOctets: 536
     OutUCastOctets: 472
     InMCastOctets: 3648
     OutMCastOctets: 3648
     InBCastOctets: 0
     OutBCastOctets: 64
     InOctets: 4184
     OutOctets: 4184
     InPacketsDma: 6
     OutPacketsDma: 54
     InOctetsDma: 512
     OutOctetsDma: 3932
     InDroppedDma: 54
     Queue[0] InPackets: 2
     Queue[0] OutPackets: 47
     Queue[0] Restarts: 0
     Queue[0] InJumboPackets: 0
     Queue[0] InLroPackets: 0
     Queue[0] InErrors: 0
     Queue[1] InPackets: 0
     Queue[1] OutPackets: 1
     Queue[1] Restarts: 0
     Queue[1] InJumboPackets: 0
     Queue[1] InLroPackets: 0
     Queue[1] InErrors: 0
     Queue[2] InPackets: 0
     Queue[2] OutPackets: 0
     Queue[2] Restarts: 0
     Queue[2] InJumboPackets: 0
     Queue[2] InLroPackets: 0
     Queue[2] InErrors: 0
     Queue[3] InPackets: 4
     Queue[3] OutPackets: 6
     Queue[3] Restarts: 0
     Queue[3] InJumboPackets: 0
     Queue[3] InLroPackets: 0
     Queue[3] InErrors: 0
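
The two outputs above can be cross-checked against each other: the RX/TX rows from ip -s link show the same numbers as the *Dma counters from ethtool -S (RX packets 6 and InPacketsDma 6, RX dropped 54 and InDroppedDma 54, TX packets 54 and OutPacketsDma 54). The following is a small Python sketch of that check, assuming the two-line "header row, value row" layout that ip -s link prints above; parse_ip_link_counters is a hypothetical helper name, and the ethtool values are copied by hand from the capture.

```python
def parse_ip_link_counters(text):
    """Parse the RX/TX tables of `ip -s link show <iface>` into nested dicts."""
    result = {}
    lines = text.splitlines()
    for i, line in enumerate(lines):
        fields = line.split()
        if fields and fields[0] in ("RX:", "TX:") and i + 1 < len(lines):
            # Column names follow the "RX:"/"TX:" tag; values are on the next line.
            values = [int(v) for v in lines[i + 1].split()]
            result[fields[0].rstrip(":")] = dict(zip(fields[1:], values))
    return result


# Copied from the Xavier-B output above.
ip_output = """4: enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:04:4b:f6:63:25 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast
    512        6        0       54      0       48
    TX: bytes  packets  errors  dropped carrier collsns
    3932       54       0       0       0       0
"""
# Copied by hand from the ethtool -S output above.
ethtool = {"InPacketsDma": 6, "InOctetsDma": 512, "InDroppedDma": 54,
           "OutPacketsDma": 54, "OutOctetsDma": 3932}

counters = parse_ip_link_counters(ip_output)
checks = {
    "RX packets == InPacketsDma": counters["RX"]["packets"] == ethtool["InPacketsDma"],
    "RX bytes   == InOctetsDma": counters["RX"]["bytes"] == ethtool["InOctetsDma"],
    "RX dropped == InDroppedDma": counters["RX"]["dropped"] == ethtool["InDroppedDma"],
    "TX packets == OutPacketsDma": counters["TX"]["packets"] == ethtool["OutPacketsDma"],
    "TX bytes   == OutOctetsDma": counters["TX"]["bytes"] == ethtool["OutOctetsDma"],
}
for label, ok in checks.items():
    print(label, "->", ok)
```

All five comparisons hold for this capture, so in this sample the two tools report the same underlying counters.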

Dear @ebrahim4o8qh,
Could you share ifconfig logs on both Tegra A and Tegra B?

Do you mean the output of ifconfig -a?

Xavier A:

~$ ifconfig -a
can0: flags=193<UP,RUNNING,NOARP>  mtu 16
        unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  txqueuelen 10  (UNSPEC)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 108  

can1: flags=193<UP,RUNNING,NOARP>  mtu 16
        unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  txqueuelen 10  (UNSPEC)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 109  

docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        inet6 fe80::42:bfff:fe30:e9ac  prefixlen 64  scopeid 0x20<link>
        ether 02:42:bf:30:e9:ac  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 43  bytes 3866 (3.8 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

dummy0: flags=130<BROADCAST,NOARP>  mtu 1500
        ether 7e:8d:9e:ad:21:49  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enP4p1s0: flags=4098<BROADCAST,MULTICAST>  mtu 1500
        ether 00:04:4b:f6:63:26  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp4s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9000
        inet 192.168.0.10  netmask 255.255.255.0  broadcast 192.168.0.255
        inet6 fe80::204:4bff:fef6:6324  prefixlen 64  scopeid 0x20<link>
        ether 00:04:4b:f6:63:24  txqueuelen 1000  (Ethernet)
        RX packets 6  bytes 512 (512.0 B)
        RX errors 0  dropped 151  overruns 0  frame 0
        TX packets 151  bytes 10722 (10.7 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.71.10  netmask 255.255.255.0  broadcast 192.168.71.255
        inet6 fe80::204:4bff:fef6:6321  prefixlen 64  scopeid 0x20<link>
        ether 00:04:4b:f6:63:21  txqueuelen 1000  (Ethernet)
        RX packets 41912069  bytes 12796513866 (12.7 GB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 63000599  bytes 513925254625405 (513.9 TB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0.200: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.42.0.28  netmask 255.255.255.0  broadcast 10.42.0.255
        inet6 fe80::204:4bff:fef6:6321  prefixlen 64  scopeid 0x20<link>
        ether 00:04:4b:f6:63:21  txqueuelen 1000  (Ethernet)
        RX packets 581405  bytes 27790250 (27.7 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 118352  bytes 6870284 (6.8 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 115171356  bytes 11107830160 (11.1 GB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 115171356  bytes 11107830160 (11.1 GB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

vcan0: flags=193<UP,RUNNING,NOARP>  mtu 72
        unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  txqueuelen 1000  (UNSPEC)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


Xavier B:

~$ ifconfig -a
can0: flags=193<UP,RUNNING,NOARP>  mtu 16
        unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  txqueuelen 10  (UNSPEC)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 106  

can1: flags=193<UP,RUNNING,NOARP>  mtu 16
        unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  txqueuelen 10  (UNSPEC)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 107  

docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        ether 02:42:b0:f3:f4:9e  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

dummy0: flags=130<BROADCAST,NOARP>  mtu 1500
        ether 4e:5b:dc:8b:54:5f  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enP4p1s0: flags=4098<BROADCAST,MULTICAST>  mtu 1500
        ether 00:04:4b:f6:63:27  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

enp3s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9000
        inet 192.168.0.11  netmask 255.255.255.0  broadcast 192.168.0.255
        inet6 fe80::204:4bff:fef6:6325  prefixlen 64  scopeid 0x20<link>
        ether 00:04:4b:f6:63:25  txqueuelen 1000  (Ethernet)
        RX packets 6  bytes 512 (512.0 B)
        RX errors 0  dropped 150  overruns 0  frame 0
        TX packets 150  bytes 10652 (10.6 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.71.11  netmask 255.255.255.0  broadcast 192.168.71.255
        inet6 fe80::204:4bff:fef6:6322  prefixlen 64  scopeid 0x20<link>
        ether 00:04:4b:f6:63:22  txqueuelen 1000  (Ethernet)
        RX packets 9272154  bytes 9729939064 (9.7 GB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 2281049  bytes 620704765691211 (620.7 TB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0.200: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.42.0.29  netmask 255.255.255.0  broadcast 10.42.0.255
        inet6 fe80::204:4bff:fef6:6322  prefixlen 64  scopeid 0x20<link>
        ether 00:04:4b:f6:63:22  txqueuelen 1000  (Ethernet)
        RX packets 581005  bytes 27773998 (27.7 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 117812  bytes 6841854 (6.8 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 3111396  bytes 786338941 (786.3 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 3111396  bytes 786338941 (786.3 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

Dear @ebrahim4o8qh,
Could you share pictures of the Ethernet connection? Also, please confirm whether the version is DRIVE OS 5.2.0 Linux.

Yes, it is DRIVE OS 5.2.0 Linux. I’ve also seen the same issue on DRIVE OS 5.1.6 Linux.

The pic of Ethernet connection:

Dear @ebrahim4o8qh,
So the board is not connected to the internet? I see a LAN cable connecting two ports.

It is also connected to the internet, but these two ports are used to connect the two Xaviers to each other. Is it reproducible on your side?

Hi @ebrahim4o8qh ,

May I know why you increased the MTU setting?

I also observed InDroppedDma and will check internally.

nvidia@tegra-ubuntu:~$ ethtool -S enp3s0
NIC statistics:
InPackets: 2504
InUCast: 20
InMCast: 1752
InBCast: 732
InErrors: 0
OutPackets: 22
OutUCast: 0
OutMCast: 20
OutBCast: 2
InUCastOctets: 1886
OutUCastOctets: 0
InMCastOctets: 425202
OutMCastOctets: 1716
InBCastOctets: 103490
OutBCastOctets: 638
InOctets: 530578
OutOctets: 2354
InPacketsDma: 753
OutPacketsDma: 22
InOctetsDma: 102596
OutOctetsDma: 2266
InDroppedDma: 22
Queue[0] InPackets: 739
Queue[0] OutPackets: 3
Queue[0] Restarts: 0
Queue[0] InJumboPackets: 0
Queue[0] InLroPackets: 0
Queue[0] InErrors: 0
Queue[1] InPackets: 0
Queue[1] OutPackets: 0
Queue[1] Restarts: 0
Queue[1] InJumboPackets: 0
Queue[1] InLroPackets: 0
Queue[1] InErrors: 0
Queue[2] InPackets: 0
Queue[2] OutPackets: 8
Queue[2] Restarts: 0
Queue[2] InJumboPackets: 0
Queue[2] InLroPackets: 0
Queue[2] InErrors: 0
Queue[3] InPackets: 14
Queue[3] OutPackets: 11
Queue[3] Restarts: 0
Queue[3] InJumboPackets: 0
Queue[3] InLroPackets: 0
Queue[3] InErrors: 0
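
One pattern that can be checked in the captures posted so far: in both ethtool -S outputs, InDroppedDma has the same value as OutPackets and OutPacketsDma (54 in the first capture, 22 in this one), and in the ifconfig outputs the RX dropped count on the 10GbE interface equals its TX packets count (151 and 150). The Python sketch below only makes that comparison explicit. The counter values are copied by hand from this thread, counters_equal_to is a hypothetical helper name, and a match in two small captures does not by itself establish a cause.

```python
def counters_equal_to(stats, target):
    """Names of counters (other than `target`) whose value equals stats[target]."""
    return sorted(name for name, value in stats.items()
                  if name != target and value == stats[target])


# Values copied from the two `ethtool -S enp3s0` captures in this thread.
first_capture = {"InPackets": 54, "OutPackets": 54, "InPacketsDma": 6,
                 "OutPacketsDma": 54, "InDroppedDma": 54}
second_capture = {"InPackets": 2504, "OutPackets": 22, "InPacketsDma": 753,
                  "OutPacketsDma": 22, "InDroppedDma": 22}

# Counters that match InDroppedDma in both captures.
common = sorted(set(counters_equal_to(first_capture, "InDroppedDma"))
                & set(counters_equal_to(second_capture, "InDroppedDma")))
print(common)  # ['OutPackets', 'OutPacketsDma']
```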

Hi @VickNV ,

Thanks for the follow-ups. This MTU size is probably left over from an old configuration we had, but I also tried 1500 and it didn’t solve the issue. On your side, does the packet drop go away when the MTU is 1500?

No, I saw the packet drop issue with an MTU of 1500 too.

Hi @VickNV
Were you able to find the cause?

Thanks for reminding us. We will check with the internal team.
May I know if this is blocking your development?

Thanks for the follow-up. We are not blocked by this, but fixing it could give us a significant performance gain, which is important to us.

May I know how you determined that this seriously impacts performance? Thanks.

I mean that we currently use another interface for two types of traffic. If this interface worked without drops, we could split that traffic across two interfaces. Unfortunately, I can’t go into more detail.

According to our testing, this issue doesn’t cause a serious performance drop.
So you can still use the interface until it is fixed.

We see serious packet drops on that interface, even with a plain ping. How can we use it?

I remember that, with the interface idle, I observed the drop count increasing every few seconds.
Is your observation the same? Why do you think it’s serious? Thanks.
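
To compare observations like "the drop count increases every few seconds" on both sides, the counter can be sampled at a fixed interval and turned into per-interval increases. The following is a minimal Python sketch, not something from the original posts. It assumes the standard Linux sysfs counter /sys/class/net/<iface>/statistics/rx_dropped; the function names, the sample count and the interval are illustrative choices.

```python
import time
from pathlib import Path


def read_rx_dropped(iface):
    """Read the kernel's cumulative RX drop counter for one interface."""
    path = Path(f"/sys/class/net/{iface}/statistics/rx_dropped")
    return int(path.read_text())


def sample_drops(iface, count=5, interval_s=2.0,
                 reader=read_rx_dropped, sleep=time.sleep):
    """Collect `count` cumulative samples, waiting `interval_s` between them.

    `reader` and `sleep` are injectable so the logic can be exercised
    without a real interface.
    """
    samples = []
    for i in range(count):
        samples.append(reader(iface))
        if i + 1 < count:
            sleep(interval_s)
    return samples


def drop_deltas(samples):
    """Turn cumulative counter samples into per-interval increases."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]
```

On the target, drop_deltas(sample_drops("enp3s0")) would give the increase in rx_dropped per 2-second interval. Running it once with the link idle and once during a ping would show whether the increases follow traffic or happen regardless.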