Hi there,
I am currently evaluating the Jetson family and need a 4x GigE PCIe card for a project.
The first test was on the big TX2 board with an Intel i340, which worked perfectly fine.
After this I tried the same card with a Jetson Nano on an AUVIDEA board, which has PCIe x4 routed to an M.2 NVMe PCIe x4 connector.
Tests with the i340 showed the following problems, which prevent the Nano from booting:
Kernel version:
R32 (release), REVISION: 5.1, GCID: 26202423, BOARD: t210ref, EABI: aarch64, DATE: Fri Feb 19 16:45:52 UTC 2021
Starting kernel ...
[ 0.000000] Booting Linux on physical CPU 0x0
[ 0.000000] Linux version 4.9.201-tegra (buildbrain@mobile-u64-4415) (gcc version 7.3.1 20180425 [linaro-7.3-2018.05 revision d29120a424ecfbc167ef90065c0eeb7f91977701] (Linaro GCC 7.3-2018.05) ) #1 SMP PREEMPT Fri Jan 15 14:41:02 PST 2021
[ 0.000000] Boot CPU: AArch64 Processor [411fd071]
[ 0.000000] OF: fdt:memory scan node memory@80000000, reg size 32,
[ 0.000000] OF: fdt: - 80000000 , 7ee00000
[ 0.000000] OF: fdt: - 100000000 , 7f200000
[ 0.000000] Found tegra_fbmem: 00800000@92cb4000
[ 0.000000] earlycon: uart8250 at MMIO32 0x0000000070006000 (options '')
[ 0.000000] bootconsole [uart8250] enabled
[ 0.389790] Node fragement does not have Overlay child
[ 0.389817] Error in parsing node /plugin-manager/fragement@10: -22
[ 1.159489] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
[ 1.159494] pcieport 0000:00:01.0: device [10de:0fae] error status/mask=00000081/00002000
[ 1.159498] pcieport 0000:00:01.0: [ 0] Receiver Error (First)
[ 1.159503] pcieport 0000:00:01.0: [ 7] Bad DLLP
[ 1.162979] tegradc tegradc.1: dpd enable lookup fail:-19
[ 1.224627] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
[ 1.234839] pcieport 0000:00:01.0: device [10de:0fae] error status/mask=00000081/00002000
[ 1.243444] pcieport 0000:00:01.0: [ 0] Receiver Error (First)
[ 1.250291] pcieport 0000:00:01.0: [ 7] Bad DLLP
[ 1.281005] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
[ 1.291210] pcieport 0000:00:01.0: device [10de:0fae] error status/mask=00000081/00002000
[ 1.299817] pcieport 0000:00:01.0: [ 0] Receiver Error (First)
[ 1.299822] pcieport 0000:00:01.0: [ 7] Bad DLLP
[ 1.502528] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
[ 1.502533] pcieport 0000:00:01.0: device [10de:0fae] error status/mask=00000081/00002000
[ 1.502537] pcieport 0000:00:01.0: [ 0] Receiver Error (First)
[ 1.502541] pcieport 0000:00:01.0: [ 7] Bad DLLP
[ 1.504833] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
[ 1.504837] pcieport 0000:00:01.0: device [10de:0fae] error status/mask=00000081/00002000
[ 1.504841] pcieport 0000:00:01.0: [ 0] Receiver Error (First)
[ 1.504845] pcieport 0000:00:01.0: [ 7] Bad DLLP
[ 1.507739] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
[ 1.507743] pcieport 0000:00:01.0: device [10de:0fae] error status/mask=00000081/00002000
[ 1.507747] pcieport 0000:00:01.0: [ 0] Receiver Error (First)
[ 1.507751] pcieport 0000:00:01.0: [ 7] Bad DLLP
[ 1.511508] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
[ 1.511512] pcieport 0000:00:01.0: device [10de:0fae] error status/mask=00000081/00002000
[ 1.511516] pcieport 0000:00:01.0: [ 0] Receiver Error (First)
[ 1.511519] pcieport 0000:00:01.0: [ 7] Bad DLLP
[ 1.674303] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
[ 1.674306] pcieport 0000:00:01.0: device [10de:0fae] error status/mask=00000081/00002000
[ 1.674310] pcieport 0000:00:01.0: [ 0] Receiver Error (First)
[ 1.674313] pcieport 0000:00:01.0: [ 7] Bad DLLP
[ 1.681416] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
[ 1.681420] pcieport 0000:00:01.0: device [10de:0fae] error status/mask=00000081/00002000
[ 1.681423] pcieport 0000:00:01.0: [ 0] Receiver Error (First)
[ 1.681533] pcieport 0000:00:01.0: [ 7] Bad DLLP
[ 1.846606] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
[ 1.856890] pcieport 0000:00:01.0: device [10de:0fae] error status/mask=00000081/00002000
[ 1.865340] pcieport 0000:00:01.0: [
[ 23.338211] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 23.345854] 0-...: (0 ticks this GP) idle=fe1/140000000000002/0 softirq=410/410 fqs=0
[ 23.356024] (detected by 1, t=5252 jiffies, g=-272, c=-273, q=60)
[ 23.364410] rcu_sched kthread starved for 5252 jiffies! g18446744073709551344 c18446744073709551343 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
[ 23.378865] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 23.386762] 0-...: (1 GPs behind) idle=fe1/140000000000002/0 softirq=401/410 fqs=2479
[ 23.396983] (detected by 1, t=5261 jiffies, g=-118, c=-119, q=31938)
[ 24.326216] Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 0
[ 24.335692] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.9.201-tegra #1
[ 24.344456] Hardware name: NVIDIA Jetson Nano Developer Kit (DT)
[ 24.352687] Call trace:
[ 24.357350] [] dump_backtrace+0x0/0x198
[ 24.364987] [] show_stack+0x24/0x30
[ 24.372255] [] dump_stack+0xa0/0xc8
[ 24.379505] [] panic+0x12c/0x2a8
[ 24.386485] [] watchdog_check_hardlockup_other_cpu+0x11c/0x120
[ 24.396072] [] watchdog_timer_fn+0x98/0x2c0
[ 24.404001] [] __hrtimer_run_queues+0xd8/0x360
[ 24.412184] [] hrtimer_interrupt+0xa8/0x1e0
[ 24.420082] [] tegra210_timer_isr+0x38/0x48
[ 24.427962] [] __handle_irq_event_percpu+0x68/0x288
[ 24.436531] [] handle_irq_event_percpu+0x28/0x60
[ 24.444851] [] handle_irq_event+0x50/0x80
[ 24.452528] [] handle_fasteoi_irq+0xd4/0x1c0
[ 24.460446] [] generic_handle_irq+0x34/0x50
[ 24.468262] [] __handle_domain_irq+0x68/0xc0
[ 24.476170] [] gic_handle_irq+0x5c/0xb0
[ 24.483632] [] el1_irq+0xe8/0x194
[ 24.490542] [] cpuidle_enter_state+0xb8/0x380
[ 24.498497] [] cpuidle_enter+0x34/0x48
[ 24.505845] [] call_cpuidle+0x44/0x70
[ 24.513077] [] cpu_startup_entry+0x1b0/0x200
[ 24.520904] [] secondary_start_kernel+0x190/0x1f8
[ 24.529160] [<0000000084f671a4>] 0x84f671a4
[ 24.535338] SMP: stopping secondary CPUs
[ 24.541294] Kernel Offset: disabled
[ 24.546746] Memory Limit: none
[ 24.556619] Rebooting in 5 seconds..
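As a side note for reading the log above: the `error status/mask=00000081/00002000` field is a pair of bit masks over the PCIe AER Correctable Error Status register. Below is a minimal Python sketch for decoding it, assuming the bit layout from the PCIe Base Specification. The function names are made up for illustration and do not come from any library.

```python
from typing import List, Tuple

# Bit positions of the PCIe AER Correctable Error Status / Mask registers
# (assumed from the PCIe Base Specification).
AER_CORRECTABLE_BITS = {
    0: "Receiver Error",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal Error",
    14: "Corrected Internal Error",
    15: "Header Log Overflow",
}


def decode_aer(value: int) -> List[str]:
    """Return the names of all known bits set in a status or mask value."""
    return [
        name
        for bit, name in sorted(AER_CORRECTABLE_BITS.items())
        if value & (1 << bit)
    ]


def decode_status_mask(field: str) -> Tuple[List[str], List[str]]:
    """Split a 'status/mask' hex pair as printed by the kernel and decode both."""
    status_hex, mask_hex = field.split("/")
    return decode_aer(int(status_hex, 16)), decode_aer(int(mask_hex, 16))


if __name__ == "__main__":
    # The two values that appear in the logs of this post.
    for field in ("00000081/00002000", "00000001/00002000"):
        status, mask = decode_status_mask(field)
        print(field, "reported:", status, "masked:", mask)
```

For `00000081/00002000` this yields Receiver Error and Bad DLLP as reported errors, with Advisory Non-Fatal Error masked, which matches the `[ 0]` and `[ 7]` lines the kernel prints.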
To check whether the board, the adapter, or the PCIe card is defective, I swapped the Nano for a Xavier NX, which worked perfectly fine.
I then updated the kernel of the Nano to
R32 (release), REVISION: 5.1, GCID: 27362550, BOARD: t210ref, EABI: aarch64, DATE: Wed May 19 18:07:59 UTC 2021
and nothing changed.
To get more information, I tried several different PCIe cards with the Nano, with the following results:
- FireWire PCIe x1: works fine
- Proprietary data card PCIe x1: works fine
- 1x GigE Realtek: works fine
- 1x GigE i310: does not work → but works with the Xavier NX
After finding some hints about problems with Intel chips, I decided to get a 4x GigE Realtek card, which again works fine with the Xavier NX:
me@jetson-xavier-jnx30-1:~$ lspci
0005:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad0 (rev a1)
0005:01:00.0 PCI bridge: Pericom Semiconductor PI7C9X2G608GP PCIe2 6-Port/8-Lane Packet Switch
0005:02:01.0 PCI bridge: Pericom Semiconductor PI7C9X2G608GP PCIe2 6-Port/8-Lane Packet Switch
0005:02:02.0 PCI bridge: Pericom Semiconductor PI7C9X2G608GP PCIe2 6-Port/8-Lane Packet Switch
0005:02:03.0 PCI bridge: Pericom Semiconductor PI7C9X2G608GP PCIe2 6-Port/8-Lane Packet Switch
0005:02:04.0 PCI bridge: Pericom Semiconductor PI7C9X2G608GP PCIe2 6-Port/8-Lane Packet Switch
0005:03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)
0005:04:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)
0005:05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)
0005:06:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)
But with the Nano there is still a problem:
[ 1.287303] pcieport 0000:00:01.0: device [10de:0fae] error status/mask=00000001/00002000
** 9072 printk messages dropped ** [ 1.431519] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
** 574 printk messages dropped ** [ 1.469868] pcieport 0000:00:01.0: [ 0] Receiver Error (First)
** 422 printk messages dropped ** [ 1.479877] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
** 573 printk messages dropped ** [ 1.494127] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
** 534 printk messages dropped ** [ 1.506196] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
** 475 printk messages dropped ** [ 1.517215] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
** 510 printk messages dropped ** [ 1.529822] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
** 570 printk messages dropped ** [ 1.545714] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
** 545 printk messages dropped ** [ 1.559770] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
** 540 printk messages dropped ** [ 1.572030] pcieport 0000:00:01.0: device [10de:0fae] error status/mask=00000001/00002000
** 477 printk messages dropped ** [ 1.583360] pcieport 0000:00:01.0: [ 0] Receiver Error (First)
** 398 printk messages dropped ** [ 1.593364] pcieport 0000:00:01.0: [ 0] Receiver Error (First)
** 327 printk messages dropped ** [ 1.600892] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
** 461 printk messages dropped ** [ 1.612164] pcieport 0000:00:01.0: [ 0] Receiver Error (First)
** 417 printk messages dropped ** [ 1.623898] pcieport 0000:00:01.0: [ 0] Receiver Error (First)
** 407 printk messages dropped ** [ 1.634818] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
** 510 printk messages dropped ** [ 1.648037] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
** 498 printk messages dropped ** [ 1.659942] pcieport 0000:00:01.0: [ 0] Receiver Error (First)
** 310 printk messages dropped ** [ 1.667813] pcieport 0000:00:01.0: device [10de:0fae] error status/mask=00000001/00002000
** 492 printk messages dropped ** [ 1.680761] pcieport 0000:00:01.0: [ 0] Receiver Error (First)
** 350 printk messages dropped ** [ 1.690667] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
** 448 printk messages dropped ** [ 1.702872] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
** 505 printk messages dropped ** [ 1.715880] pcieport 0000:00:01.0: [ 0] Receiver Error (First)
** 354 printk messages dropped ** [ 1.725966] pcieport 0000:00:01.0: device [10de:0fae] error status/mask=00000001/00002000
** 391 printk messages dropped ** [ 1.736423] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
** 372 printk messages dropped ** [ 1.746976] pcieport 0000:00:01.0: device [10de:0fae] error status/mask=00000001/00002000
** 392 printk messages dropped ** [ 1.757362] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
** 504 printk messages dropped ** [ 1.774217] pcieport 0000:00:01.0: device [10de:0fae] error status/mask=00000001/00002000
** 390 printk messages dropped ** [ 1.785908] pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)
** 469 printk messages dropped ** [ 1.798420] pcieport 0000:00:01.0: [ 0] Receiver Error (First)
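To put a number on the error flood, a serial console capture like the one above can be tallied with a short script. This is only a sketch in Python: the function name `summarize_console_log` is hypothetical, and the sample text is just two lines copied from the log in this post.

```python
import re
from collections import Counter

# Matches the kernel's "** N printk messages dropped **" markers.
DROPPED_RE = re.compile(r"\*\* (\d+) printk messages dropped \*\*")

# Substrings that identify the AER message types seen in this post.
EVENT_KEYS = ("PCIe Bus Error", "Receiver Error", "Bad DLLP")


def summarize_console_log(log_text):
    """Return (total dropped printk messages, Counter of visible AER lines)."""
    dropped = sum(int(n) for n in DROPPED_RE.findall(log_text))
    events = Counter()
    for line in log_text.splitlines():
        for key in EVENT_KEYS:
            if key in line:
                events[key] += 1
    return dropped, events


# Two lines taken verbatim from the Nano log above.
SAMPLE = (
    "** 9072 printk messages dropped ** [ 1.431519] pcieport 0000:00:01.0: "
    "PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0008(Receiver ID)\n"
    "** 574 printk messages dropped ** [ 1.469868] pcieport 0000:00:01.0: "
    "[ 0] Receiver Error (First)\n"
)

if __name__ == "__main__":
    dropped, events = summarize_console_log(SAMPLE)
    print("dropped:", dropped)
    print("visible events:", dict(events))
```

The dropped count matters more than the visible lines here, because most of the messages never reach the console.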
How can I fix this problem?
Can anyone help please?
Does anyone have a working solution for 4x GigE with the Nano?