NVIDIA Jetson TX2
L4T 28.2.1 [ JetPack 3.3 or 3.2.1 ]
Board: t186ref
Ubuntu 16.04 LTS
Kernel Version: 4.4.38-tegra
I’m having some issues loading a PCI kernel module during startup on a Jetson TX2. When our system boots, the kernel has 3 attempts to load, and the third time is successful.
I have located the problem to be due to our PCI driver. The module init function gets called, however, after the call to pci_register_driver(), the probe() function does not. Instead there are lots of errors reported, eventually culminating in the message “Fixing recursive fault but reboot is needed!”. This repeats, then on the third attempt the module load is successful (probe() function called) and the kernel load completes.
Some of the kernel log, immediately after the pci_register_driver() call is as follows :
[ 7.756180] xhci-tegra 3530000.xhci: Firmware timestamp: 2017-12-07 10:50:08 UTC, Version: 55.09 release
[ 7.769348] xhci-tegra 3530000.xhci: xHCI Host Controller
[ 7.775566] xhci-tegra 3530000.xhci: new USB bus registered, assigned bus number 1
[ 7.785485] xhci-tegra 3530000.xhci: hcc params 0x0184fd25 hci version 0x100 quirks 0x00010810
[ 7.795725] xhci-tegra 3530000.xhci: irq 60, io mem 0x03530000
[ 7.802469] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
[ 7.810166] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 7.810171] usb usb1: Product: xHCI Host Controller
[ 7.810173] usb usb1: Manufacturer: Linux 4.4.38-tegra xhci-hcd
[ 7.810175] usb usb1: SerialNumber: 3530000.xhci
[ 7.810623] hub 1-0:1.0: USB hub found
[ 7.810641] hub 1-0:1.0: 4 ports detected
[ 7.826478] dhd_module_init in
[ 7.826509] tegra_net_perf_init: cannot get wifi sclk
[ 7.826581] found wifi platform device bcmdhd_wlan
[ 7.827957] wifi_platform_get_country_code_map: could not get country_code_map
[ 7.827959] wifi_plat_dev_drv_probe:platform country code map is not available
[ 7.827972] Power-up adapter 'DHD generic adapter'
[ 7.827977] wifi_platform_set_power = 1
[ 7.834340] xhci-tegra 3530000.xhci: xHCI Host Controller
[ 7.834347] xhci-tegra 3530000.xhci: new USB bus registered, assigned bus number 2
[ 7.834460] usb usb2: New USB device found, idVendor=1d6b, idProduct=0003
[ 7.834462] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 7.834464] usb usb2: Product: xHCI Host Controller
[ 7.834466] usb usb2: Manufacturer: Linux 4.4.38-tegra xhci-hcd
[ 7.834467] usb usb2: SerialNumber: 3530000.xhci
[ 7.834777] hub 2-0:1.0: USB hub found
[ 7.834799] hub 2-0:1.0: 3 ports detected
[ 7.835144] tegra-xotg xotg: otg: host 3530000.xhci registered
[ 7.934622] xhci-tegra 3530000.xhci: entering ELPG
[ 7.937239] xhci-tegra 3530000.xhci: entering ELPG done
[ 8.028372] tegra-pcie 10003000.pcie-controller: link 2 down, retrying
[ 8.030628] wifi_platform_bus_enumerate device present 1
[ 8.045897] tegra-pcie 10003000.pcie-controller: link 2 down, ignoring
[ 8.046298] tegra-pcie 10003000.pcie-controller: PCI host bridge to bus 0000:00
[ 8.046343] pci_bus 0000:00: root bus resource [mem 0x50100000-0x57ffffff]
[ 8.046348] pci_bus 0000:00: root bus resource [mem 0x58000000-0x7fffffff pref]
[ 8.046373] pci_bus 0000:00: root bus resource [bus 00-ff]
[ 8.046376] pci_bus 0000:00: root bus resource [io 0x1000-0xffff]
[ 8.046408] pci 0000:00:01.0: [10de:10e5] type 01 class 0x060400
[ 8.046513] pci 0000:00:01.0: PME# supported from D0 D1 D2 D3hot D3cold
[ 8.046657] iommu: Adding device 0000:00:01.0 to group 56
[ 8.046759] pci 0000:00:01.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[ 8.046892] pci 0000:01:00.0: [1172:e003] type 00 class 0x058000
[ 8.046939] pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x000001ff 64bit pref]
[ 8.046960] pci 0000:01:00.0: reg 0x18: [mem 0x00000000-0x0000ffff 64bit pref]
[ 8.046977] pci 0000:01:00.0: reg 0x20: [mem 0x00000000-0x000fffff 64bit pref]
[ 8.047169] iommu: Adding device 0000:01:00.0 to group 57
[ 8.047218] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
[ 8.047257] pci 0000:00:01.0: BAR 9: assigned [mem 0x58000000-0x581fffff 64bit pref]
[ 8.047264] pci 0000:01:00.0: BAR 4: assigned [mem 0x58000000-0x580fffff 64bit pref]
[ 8.047281] pci 0000:01:00.0: BAR 2: assigned [mem 0x58100000-0x5810ffff 64bit pref]
[ 8.047297] pci 0000:01:00.0: BAR 0: assigned [mem 0x58110000-0x581101ff 64bit pref]
[ 8.047313] pci 0000:00:01.0: PCI bridge to [bus 01]
[ 8.047323] pci 0000:00:01.0: bridge window [mem 0x58000000-0x581fffff 64bit pref]
[ 8.048543] pcieport 0000:00:01.0: enabling device (0000 -> 0002)
[ 8.048650] pcieport 0000:00:01.0: Signaling PME through PCIe PME interrupt
[ 8.048652] pci 0000:01:00.0: Signaling PME through PCIe PME interrupt
[ 8.048658] pcie_pme 0000:00:01.0:pcie01: service driver pcie_pme loaded
[ 8.048764] aer 0000:00:01.0:pcie02: service driver aer loaded
[ 8.048893] Bad mode in Synchronous Abort handler detected, code 0x86000007 -- IABT (current EL)
[ 8.048898] Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP
[ 8.048911] Modules linked in: bcmdhd(+) gpu_beamformer(O) pci_tegra uio_pdrv_genirq uio bluedroid_pm
[ 8.048918] CPU: 1 PID: 15 Comm: kworker/1:0 Tainted: G O 4.4.38-tegra #23
[ 8.048919] Hardware name: quill (DT)
[ 8.048932] Workqueue: events pcie_delayed_detect [pci_tegra]
[ 8.048935] task: ffffffc1ecdef080 ti: ffffffc1ece1c000 task.ti: ffffffc1ece1c000
[ 8.048937] PC is at 0xffffffbffc0dc000
[ 8.048946] LR is at pci_device_probe+0x98/0x10c
[ 8.048949] pc : [<ffffffbffc0dc000>] lr : [<ffffffc0003900d0>] pstate: 60000045
[ 8.048951] sp : ffffffc1ece1fa80
[ 8.048956] x29: ffffffc1ece1fa80 x28: 0000000000000100
[ 8.048959] x27: ffffffc1e4d42200 x26: ffffffbffc0c2db8
[ 8.048962] x25: 0000000000000017 x24: ffffffbffc0d94a8
[ 8.048966] x23: ffffffbffc0d93a8 x22: ffffffbffc0d9410
[ 8.048969] x21: 0000000000000000 x20: ffffffc1dc886800
[ 8.048972] x19: ffffffc1dc886890 x18: 0000000000000000
[ 8.048975] x17: 0000007fa6bb9df8 x16: ffffffc000b62a60
[ 8.048979] x15: 0a01000a02020501 x14: 0ffffffffffffffe
[ 8.048983] x13: ffffffff00000000 x12: 0000000000000030
[ 8.048986] x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
[ 8.048989] x9 : feff716475687163 x8 : ffffffffffffffff
[ 8.048992] x7 : ffffffc1e5b8c798 x6 : ffffffc1e5b8c798
[ 8.048996] x5 : ffffffc1e5b8c799 x4 : 0000000000000000
[ 8.048999] x3 : ffffffc1ece1f9c0 x2 : ffffffbffc0dc000
[ 8.049002] x1 : ffffffbffc0d94a8 x0 : ffffffc1dc886800
[ 8.049003]
[ 8.049006] Process kworker/1:0 (pid: 15, stack limit = 0xffffffc1ece1c020)
[ 8.049007] Call trace:
[ 8.049032] [<ffffffbffc0dc000>] 0xffffffbffc0dc000
[ 8.049038] [<ffffffc000575344>] driver_probe_device+0xc8/0x408
[ 8.049041] [<ffffffc0005757dc>] __device_attach_driver+0x8c/0xb0
[ 8.049045] [<ffffffc000573458>] bus_for_each_drv+0x54/0x94
[ 8.049048] [<ffffffc00057516c>] __device_attach+0xc0/0x150
[ 8.049050] [<ffffffc00057520c>] device_attach+0x10/0x18
[ 8.049055] [<ffffffc000385bf8>] pci_bus_add_device+0x3c/0x60
[ 8.049057] [<ffffffc000386014>] pci_bus_add_devices+0x44/0x90
[ 8.049059] [<ffffffc000386040>] pci_bus_add_devices+0x70/0x90
[ 8.049065] [<ffffffc0000941bc>] pci_common_init_dev+0x2d4/0x394
[ 8.049072] [<ffffffbffc0bde54>] pcie_delayed_detect+0x894/0xa80 [pci_tegra]
[ 8.049076] [<ffffffc0000bc2d0>] process_one_work+0x154/0x434
[ 8.049079] [<ffffffc0000bc6e4>] worker_thread+0x134/0x40c
[ 8.049082] [<ffffffc0000c1f30>] kthread+0xe0/0xf4
[ 8.049089] [<ffffffc000084f90>] ret_from_fork+0x10/0x40
[ 8.049324] wifi_platform_bus_enumerate device present 0
[ 8.049457] ---[ end trace 0394ab4bd3dfd138 ]---
[ 8.051204] Mass Storage Function, version: 2009/09/11
[ 8.051208] LUN: removable file: (no medium)
[ 8.051380] ------------[ cut here ]------------
[ 8.051383] WARNING: at ffffffc0000a9074 [verbose debug info unavailable]
[ 8.051394] Modules linked in: bcmdhd(+) gpu_beamformer(O) pci_tegra uio_pdrv_genirq uio bluedroid_pm
[ 8.051395]
[ 8.051401] CPU: 1 PID: 15 Comm: kworker/1:0 Tainted: G D O 4.4.38-tegra #23
[ 8.051402] Hardware name: quill (DT)
[ 8.051415] task: ffffffc1ecdef080 ti: ffffffc1ece1c000 task.ti: ffffffc1ece1c000
[ 8.051423] PC is at __local_bh_enable_ip+0x68/0xb8
[ 8.051429] LR is at _raw_spin_unlock_bh+0x20/0x28
[ 8.051431] pc : [<ffffffc0000a9074>] lr : [<ffffffc000b57418>] pstate: 400001c5
[ 8.051432] sp : ffffffc1ece1f780
[ 8.051437] x29: ffffffc1ece1f780 x28: ffffffc1ece1c000
[ 8.051440] x27: ffffffc1e4d42200 x26: ffffffc1ecdef080
[ 8.051443] x25: ffffffc1ecdef080 x24: 00000000000001c0
[ 8.051446] x23: 0000000000000001 x22: 0000000000000000
[ 8.051449] x21: ffffffc000f4bd48 x20: ffffffc1ecdef080
[ 8.051452] x19: ffffffc001456858 x18: 0000000000000000
[ 8.051455] x17: 0000007fa6bb9df8 x16: ffffffc000b62a60
[ 8.051458] x15: 0a01000a02020501 x14: 6563653163666666
[ 8.051461] x13: 666666203a69742e x12: 0000000000000000
[ 8.051464] x11: 0101010101010101 x10: ffffffffffff7f7f
[ 8.051468] x9 : fefefec0db8461ff x8 : ffffffc1ecdef8a0
[ 8.051471] x7 : 09432c3334313738 x6 : ffffffc001466000
[ 8.051474] x5 : ffffffc00126d560 x4 : ffffffc001252050
[ 8.051477] x3 : ffffffc00126d418 x2 : 0000000000000000
[ 8.051480] x1 : 0000000000000201 x0 : ffffffc0013d2000
[ 8.051481]
[ 8.051753] ---[ end trace 0394ab4bd3dfd139 ]---
[ 8.051754] Call trace:
[ 8.051758] [<ffffffc0000a9074>] __local_bh_enable_ip+0x68/0xb8
[ 8.051782] [<ffffffc000b57418>] _raw_spin_unlock_bh+0x20/0x28
[ 8.051788] [<ffffffc00012c81c>] cgroup_exit+0x58/0xe4
[ 8.051791] [<ffffffc0000a7880>] do_exit+0x2a4/0x9a8
[ 8.051796] [<ffffffc000089bd8>] bug_handler.part.3+0x0/0x7c
[ 8.051801] [<ffffffc000089c9c>] arm64_notify_die+0x1c/0x58
[ 8.051804] [<ffffffc000089ef0>] bad_mode+0x84/0x90
[ 8.051809] [<ffffffc0003900d0>] pci_device_probe+0x98/0x10c
[ 8.051813] [<ffffffc000575344>] driver_probe_device+0xc8/0x408
[ 8.051816] [<ffffffc0005757dc>] __device_attach_driver+0x8c/0xb0
[ 8.051820] [<ffffffc000573458>] bus_for_each_drv+0x54/0x94
[ 8.051823] [<ffffffc00057516c>] __device_attach+0xc0/0x150
[ 8.051825] [<ffffffc00057520c>] device_attach+0x10/0x18
[ 8.051829] [<ffffffc000385bf8>] pci_bus_add_device+0x3c/0x60
[ 8.051831] [<ffffffc000386014>] pci_bus_add_devices+0x44/0x90
[ 8.051834] [<ffffffc000386040>] pci_bus_add_devices+0x70/0x90
[ 8.051839] [<ffffffc0000941bc>] pci_common_init_dev+0x2d4/0x394
[ 8.051847] [<ffffffbffc0bde54>] pcie_delayed_detect+0x894/0xa80 [pci_tegra]
[ 8.051850] [<ffffffc0000bc2d0>] process_one_work+0x154/0x434
[ 8.051852] [<ffffffc0000bc6e4>] worker_thread+0x134/0x40c
[ 8.051856] [<ffffffc0000c1f30>] kthread+0xe0/0xf4
[ 8.051860] [<ffffffc000084f90>] ret_from_fork+0x10/0x40
[ 8.051938] Unable to handle kernel paging request at virtual address ffffffffffffffd8
[ 8.051945] pgd = ffffffc07d453000
[ 8.051959] [ffffffffffffffd8] *pgd=00000000fee5a003, *pud=00000000fee5a003, *pmd=0000000000000000
[ 8.051964] Internal error: Oops: 96000005 [#2] PREEMPT SMP
[ 8.051989] Modules linked in: bcmdhd(+) gpu_beamformer(O) pci_tegra uio_pdrv_genirq uio bluedroid_pm
[ 8.051995] CPU: 1 PID: 15 Comm: kworker/1:0 Tainted: G D W O 4.4.38-tegra #23
[ 8.051999] Hardware name: quill (DT)
[ 8.052013] task: ffffffc1ecdef080 ti: ffffffc1ece1c000 task.ti: ffffffc1ece1c000
[ 8.052018] PC is at kthread_data+0x4/0xc
[ 8.052021] LR is at wq_worker_sleeping+0x10/0xc4
[ 8.052026] pc : [<ffffffc0000c25f8>] lr : [<ffffffc0000bd190>] pstate: 600000c5
[ 8.052029] sp : ffffffc1ece1f750
[ 8.052042] x29: ffffffc1ece1f750 x28: ffffffc1ece1c000
[ 8.052052] x27: ffffffc1e4d42200 x26: ffffffc001252000
[ 8.052062] x25: 0000000000000001 x24: ffffffc000b53aa0
[ 8.052072] x23: 0000000000000000 x22: ffffffc1ecdef610
[ 8.052082] x21: ffffffc001223000 x20: ffffffc1ecdef080
[ 8.052092] x19: ffffffc1f66fe580 x18: ffffffc000bd5120
[ 8.052102] x17: 000000000000000e x16: 0000000000000007
[ 8.052112] x15: 0000000000000001 x14: 0000000000000007
[ 8.052121] x13: 000000000000000e x12: 0000000000000013
[ 8.052131] x11: 000000000000001a x10: 000000000044a850
[ 8.052140] x9 : 000000000044a850 x8 : 0000000000000400
[ 8.052150] x7 : 0000000000000000 x6 : 000000000284a84f
[ 8.052160] x5 : ffffffc1f66fe580 x4 : ffffffc1f66fef60
[ 8.052170] x3 : 000000000001af3b x2 : ffffffc1f66dd000
[ 8.052177] x1 : 0000000000000001 x0 : 0000000000000000
[ 8.052178]
[ 8.052180] Process kworker/1:0 (pid: 15, stack limit = 0xffffffc1ece1c020)
[ 8.052181] Call trace:
[ 8.052184] [<ffffffc0000c25f8>] kthread_data+0x4/0xc
[ 8.052187] [<ffffffc000b536d0>] __schedule+0x348/0x6d4
[ 8.052190] [<ffffffc000b53aa0>] schedule+0x44/0xa8
[ 8.052192] [<ffffffc0000a7b94>] do_exit+0x5b8/0x9a8
[ 8.052195] [<ffffffc000089bd8>] bug_handler.part.3+0x0/0x7c
[ 8.052198] [<ffffffc000089c9c>] arm64_notify_die+0x1c/0x58
[ 8.052200] [<ffffffc000089ef0>] bad_mode+0x84/0x90
[ 8.052203] [<ffffffc0003900d0>] pci_device_probe+0x98/0x10c
[ 8.052206] [<ffffffc000575344>] driver_probe_device+0xc8/0x408
[ 8.052208] [<ffffffc0005757dc>] __device_attach_driver+0x8c/0xb0
[ 8.052213] [<ffffffc000573458>] bus_for_each_drv+0x54/0x94
[ 8.052216] [<ffffffc00057516c>] __device_attach+0xc0/0x150
[ 8.052234] [<ffffffc00057520c>] device_attach+0x10/0x18
[ 8.052237] [<ffffffc000385bf8>] pci_bus_add_device+0x3c/0x60
[ 8.052239] [<ffffffc000386014>] pci_bus_add_devices+0x44/0x90
[ 8.052241] [<ffffffc000386040>] pci_bus_add_devices+0x70/0x90
[ 8.052244] [<ffffffc0000941bc>] pci_common_init_dev+0x2d4/0x394
[ 8.052250] [<ffffffbffc0bde54>] pcie_delayed_detect+0x894/0xa80 [pci_tegra]
[ 8.052252] [<ffffffc0000bc2d0>] process_one_work+0x154/0x434
[ 8.052254] [<ffffffc0000bc6e4>] worker_thread+0x134/0x40c
[ 8.052257] [<ffffffc0000c1f30>] kthread+0xe0/0xf4
[ 8.052259] [<ffffffc000084f90>] ret_from_fork+0x10/0x40
[ 8.052392] ---[ end trace 0394ab4bd3dfd13a ]---
[ 8.054399] F1 signature read @0x18000000=0x17214354
[ 8.054400] Fixing recursive fault but reboot is needed!
[ 36.074307] Watchdog detected hard LOCKUP on cpu 3
I discovered that if the module load is delayed for 15s, or so, there are no issues, however I think this problem is something that should be understood and resolved.
Any help you may provide is greatly appreciated.
Thanks.