Jetson TX2 crashes when using a PCIe Ethernet card (NIC)

I am trying to use a PCIe NIC (a 10 Gbit Ethernet card) on a TX2. I have tried both JetPack 3.1 and 3.2.

https://www.asus.com/ca-en/Networking/XG-C100C/

I downloaded the NIC driver source code from the official ASUS site and, following the NVIDIA_Tegra_Linux_Driver_Package documentation, compiled the driver on my Tegra platform.

$ cd /usr/src/linux-headers-$(uname -r)
$ sudo make modules_prepare

and built the driver with

$ make

and I can load the module with

$ sudo insmod atlantic.ko

lsmod shows the module is loaded, and I get a new Ethernet interface named “eth1” (the onboard one is named eth0).

Things were going well until I plugged in the RJ45 cable and the Ubuntu NetworkManager tried to use DHCP to obtain an IP address for this interface.
The system suddenly blacked out and started rebooting without any error message.

If I assign a static IP to this interface, the system does not reboot, but when I use the ping command to ping another IP, the system blacks out and starts rebooting.

Do I need to do something more when using PCIe on the TX2?

I don’t have a card to test, but it sounds mostly like an issue with user space configuration.

This may be something related since it may be a case of whether NetworkManager is in conflict with something:
http://xmodulo.com/disable-network-manager-linux.html

Although that URL is about disabling NetworkManager, you might find that you want to disable or enable NetworkManager for one interface, but not for another. Perhaps use that URL to see and understand when NetworkManager has its hands in the pie or not. It can be quite frustrating to make a change and only later discover your change didn’t take effect because NetworkManager saw something seemingly unrelated change and decided to revert your changes or produce a conflicting configuration.
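As a rough sketch of the "one interface but not another" idea: NetworkManager can be told to leave a single device alone through an `unmanaged-devices` entry in the `[keyfile]` section of its configuration. The helper below only writes such a drop-in file. The function name, the file name, and the assumption that this JetPack release reads drop-ins from /etc/NetworkManager/conf.d are mine, not something confirmed in this thread.

```shell
#!/bin/sh
# Hypothetical helper: write a NetworkManager drop-in that marks one
# interface as unmanaged, so only that interface is left to manual setup.
# Assumption: the PCIe NIC is eth1 and drop-ins live in
# /etc/NetworkManager/conf.d (the file name itself is arbitrary).
write_unmanaged_conf() {
    iface="$1"      # interface NetworkManager should ignore, e.g. eth1
    conf_dir="$2"   # normally /etc/NetworkManager/conf.d
    mkdir -p "$conf_dir" || return 1
    printf '[keyfile]\nunmanaged-devices=interface-name:%s\n' "$iface" \
        > "$conf_dir/99-unmanaged-$iface.conf"
}
```

Run as root, this would be called as `write_unmanaged_conf eth1 /etc/NetworkManager/conf.d`, followed by `sudo systemctl restart NetworkManager`. If it took effect, `nmcli device status` should list eth1 as unmanaged while eth0 stays managed.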

An example of when network interfaces can bring a system down is if one interface in some way overlaps another interface and the two end up in infinite interaction with each other. I once managed to do this with a bridge where MAC addresses were forwarded infinitely, and it behaved similarly to what you are describing.

Since some part of that module works under some specific circumstances, it does tend to support the idea that it isn’t a driver issue, but most likely configuration. Do watch “dmesg --follow” while working (preferably from a serial console) and see if any useful messages are generated.
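If no serial console is handy, one possible fallback is to copy the kernel log to a file and flush it to disk after every line, so that whatever was printed just before the reboot has a chance of surviving. This is only a sketch: the function name and the log path are made up for illustration, and a hard reset can still lose the last lines.

```shell
#!/bin/sh
# Hypothetical helper: append each line read from stdin to a file and
# force it to disk immediately, so the tail of the log is more likely
# to survive a sudden reboot.
log_kernel_messages() {
    out="$1"   # destination file, e.g. a path in the home directory
    while IFS= read -r line; do
        printf '%s\n' "$line" >> "$out"
        sync   # slow, but the point is that nothing stays in the page cache
    done
}
```

Usage would be something like `dmesg --follow | log_kernel_messages ~/dmesg-eth1.log` in one terminal, then running the command that triggers the crash in another, and reading the file after the reboot.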

Hi linuxdev

I tried disabling NetworkManager and assigning the IP from the command line, but got the same result.

I used the following commands to stop NetworkManager

sudo systemctl stop NetworkManager.service
sudo systemctl disable NetworkManager.service

and then manually brought up my PCIe NIC interface

sudo ifconfig eth1 up

When I run the DHCP client (dhclient) on eth1, the system hangs and then reboots after about 3 seconds.

I also tried disabling the other network interface ($ sudo ifconfig eth0 down), leaving only eth1 and lo, but no luck.

Hi edward.lo,

Not sure if you have a chance to catch the error log before the device goes down.
We have internally checked the following 10G NIC. You could try it if you can get one.
Intel Corp X540T2 Converged Network Adapter T2

Is it possible to show “ifconfig” and “route” before changes are added which cause a crash, and then show the commands you add from which it later ends up crashing? If it doesn’t crash immediately, can you show “ifconfig” and “route” output after the commands as well (you don’t necessarily need to run the “ifup” to see some changes in ifconfig or route, but if you can still run ifconfig or route after ifup see what it says)?

What did the “dmesg --follow” show?
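One way to collect the before/after state asked for above is to dump “ifconfig” and “route” into labelled files and sync them to disk right away, so the "before" copy is still there even if the next command reboots the board. This is a minimal sketch; the function name, labels, and output directory are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: save the current interface and routing state to
# <out_dir>/<label>.txt and flush it to disk before anything else runs.
snapshot_net_state() {
    label="$1"     # e.g. "before" or "after"
    out_dir="$2"   # any writable directory
    mkdir -p "$out_dir" || return 1
    {
        echo "== ifconfig -a =="
        ifconfig -a 2>&1   # -a also lists interfaces that are down
        echo "== route -n =="
        route -n 2>&1      # -n avoids DNS lookups that could stall
    } > "$out_dir/$label.txt"
    sync
}
```

For example: `snapshot_net_state before ~/netstate`, then the command under test, then `snapshot_net_state after ~/netstate` if the system is still up, and finally `diff ~/netstate/before.txt ~/netstate/after.txt`.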

Hi WayneWWW and linuxdev,

I got the following dmesg output when I insmod the driver module

[51289.828175] aquantia 0000:01:00.0: enabling device (0000 -> 0002)

and after I ran ifconfig eth1 up, dmesg --follow showed

[51430.009732] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
[51496.138999] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready (<---- this line appears when I plug in the RJ45 cable)

then I ran the following commands

nvidia@tegra-ubuntu:~/Atlantic$ ifconfig
eth1      Link encap:Ethernet  HWaddr b0:6e:bf:a9:f2:d6
          inet6 addr: fe80::b26e:bfff:fea9:f2d6/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:356 errors:0 dropped:0 overruns:0 frame:0
          TX packets:23 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:37916 (37.9 KB)  TX bytes:3541 (3.5 KB)

l4tbr0    Link encap:Ethernet  HWaddr 0a:6b:05:0d:54:3a
          inet addr:192.168.55.1  Bcast:192.168.55.255  Mask:255.255.255.0
          inet6 addr: fe80::8049:e1ff:fe56:5d36/64 Scope:Link
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:10 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:857 (857.0 B)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:282580 errors:0 dropped:0 overruns:0 frame:0
          TX packets:282580 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1
          RX bytes:20910784 (20.9 MB)  TX bytes:20910784 (20.9 MB)

usb0      Link encap:Ethernet  HWaddr 0a:6b:05:0d:54:3a
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

usb1      Link encap:Ethernet  HWaddr 32:3c:78:3d:a3:df
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

nvidia@tegra-ubuntu:~/Atlantic$ route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.55.0    *               255.255.255.0   U     0      0        0 l4tbr0

The command that causes the system crash is the following

$ sudo dhclient eth1

After this command, the system hangs and then reboots automatically, and dmesg stops updating.

One difference in dmesg is that when I use eth0, dmesg prints a line saying

[52712.720340] eqos 2490000.ether_qos eth0: Link is Up - 100Mbps/Full - flow control rx/tx

but no such line is shown when I use the PCIe NIC.

I see what looks like a bridge interface. Normally a bridge does not get an address, but it isn’t any issue if it does have an address. However, what does “route” say prior to the command which causes the crash? If the assigned address overlaps with another device it is possible for this to cause such a crash. Can you try “sudo ifdown l4tbr0” prior to the command which crashes? I doubt usb0 or usb1 are an issue; they do not show any TX or RX packets, so they were never touched.
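To make the overlap idea concrete, here is a small sketch that checks whether two IPv4 addresses would land in the same network once the shorter of the two prefixes is applied. The 192.168.55.1/24 value is l4tbr0's address from the ifconfig output above; any address eth1 might receive from DHCP is unknown here, so the second argument pair is purely illustrative. Both function names are made up.

```shell
#!/bin/sh
# Hypothetical helper: convert a dotted-quad IPv4 address to an integer.
# (Octets with leading zeros are not handled.)
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into four positional octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Hypothetical helper: succeed (exit 0) if the two networks overlap.
# usage: subnets_overlap ADDR1 PREFIX1 ADDR2 PREFIX2
subnets_overlap() {
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$3")
    p=$2
    if [ "$4" -lt "$p" ]; then p=$4; fi   # compare under the shorter prefix
    if [ "$p" -eq 0 ]; then return 0; fi  # a /0 network contains everything
    mask=$(( (0xFFFFFFFF << (32 - p)) & 0xFFFFFFFF ))
    [ $(( a & mask )) -eq $(( b & mask )) ]
}
```

For instance, `subnets_overlap 192.168.55.1 24 192.168.55.77 24` succeeds, meaning a lease in that range would collide with the bridge, while `subnets_overlap 192.168.55.1 24 192.168.1.10 24` fails, meaning the two are separate networks.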

Even if you don’t get a DHCP response it would be useful to know if having l4tbr0 disabled prevents the crash. If this does change things significantly, then you’ve probably run into the same misconfiguration I had years ago.

Incidentally, how are the bridge and other interfaces physically wired in relation to each other? For example, is there a switch eth1 goes to which in turn goes to one of the bridge interfaces?

I disabled NetworkManager, then used ifdown to disable all network interfaces except eth1 and lo (I disabled l4tbr0, usb0, and usb1).

The network configuration on my TX2 looks like

nvidia@tegra-ubuntu:~/Atlantic$ ifconfig
eth1      Link encap:Ethernet  HWaddr b0:6e:bf:a9:f2:d6  
          inet6 addr: fe80::b26e:bfff:fea9:f2d6/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:18 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 B)  TX bytes:2371 (2.3 KB)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:2198 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2198 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1 
          RX bytes:138059 (138.0 KB)  TX bytes:138059 (138.0 KB)

and the routing table looks like

nvidia@tegra-ubuntu:~/Atlantic$ route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface

The system still crashes after I enter “sudo dhclient eth1”.

BTW, my TX2 PCIe NIC is connected by an RJ45 cable to a hub, and the hub is connected to a router. The DHCP server is on the router.

If you use a serial console, are you able to run “dmesg --follow” and see more output than from a regular terminal? If you have a fresh boot, and have not run the dhclient command yet, does “ifconfig” still show 18 TX packets?

What else is connected to that network switch besides the router?

I know the switch is probably 10Gbit, but as a test, can you see if putting a regular gigabit switch in changes things?