Kernel randomly crashes with IABT error

Hi everyone,
We are hitting a problem that occurs randomly, as follows.
Description:
1. We are using a TX2 with JetPack 3.0 (L4T R27.1).
2. Our device uses usbmap config #3.
3. A TW6869 PCIe device is connected to the PCIe port.
4. We got the driver from https://github.com/sasamy/tw6869.git, branch linux-4.x.
5. After the system boots, we install the driver and start an application that collects data from the 8 channels by DMA.
6. We found that with a probability of 7 percent the system hangs and reboots.
7. Sometimes the kernel prints a backtrace, but more often it prints nothing.
8. In the backtraces we found an IABT error, but the log is different every time.

Can someone give us some advice on debugging? Thanks a lot!
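
For reference when reading the logs below: the "code 0x..." value in the "Bad mode in Synchronous Abort handler" line appears to be the raw exception syndrome (ESR) value. The following is a minimal sketch that splits it into its fields, assuming the ARMv8-A ESR_ELx layout (EC in bits 31:26, IL in bit 25, fault status code in bits 5:0). The helper name `decode_esr` and the two lookup tables are introduced here for illustration and cover only a few encodings.

```python
# Sketch: decode the "code 0x..." value from the "Bad mode in Synchronous Abort
# handler" line, assuming it is the raw ESR value (ARMv8-A ESR_ELx layout).

EC_NAMES = {
    0x20: "Instruction Abort from a lower Exception level",
    0x21: "Instruction Abort taken without a change in Exception level",
    0x24: "Data Abort from a lower Exception level",
    0x25: "Data Abort taken without a change in Exception level",
}

# Partial table of fault status codes; anything else is reported as unknown.
FSC_NAMES = {
    0x04: "Translation fault, level 0",
    0x05: "Translation fault, level 1",
    0x06: "Translation fault, level 2",
    0x07: "Translation fault, level 3",
    0x0D: "Permission fault, level 1",
    0x0E: "Permission fault, level 2",
    0x0F: "Permission fault, level 3",
    0x10: "Synchronous External abort",
}


def decode_esr(esr):
    """Split an ESR value into EC (bits 31:26), IL (bit 25) and FSC (bits 5:0)."""
    ec = (esr >> 26) & 0x3F
    il = (esr >> 25) & 0x1
    fsc = esr & 0x3F
    return {
        "ec": ec,
        "ec_name": EC_NAMES.get(ec, "not in this table"),
        "il": il,
        "fsc": fsc,
        "fsc_name": FSC_NAMES.get(fsc, "not in this table"),
    }


if __name__ == "__main__":
    # The two codes that appear in the crash logs of this thread.
    for code in (0x86000006, 0x86000005):
        info = decode_esr(code)
        print(hex(code), hex(info["ec"]), info["ec_name"], "/", info["fsc_name"])
```

Both codes decode to EC 0x21 (an instruction abort taken at the current exception level) with a translation fault, at level 2 for 0x86000006 and level 1 for 0x86000005. Together with "PC is at 0x0" this is consistent with the CPU branching to address 0, for example through a NULL function pointer, although the logs alone do not show where that pointer came from.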

crash log 01

[  360.164688] ------------[ cut here ]------------
[  360.169314] WARNING: at ffffffc000137e04 [verbose debug info unavailable]
[  360.176473] ---[ end trace 4c16bc62675417b9 ]---
[  360.181086] Call trace:
[  421.416823] Bad mode in Synchronous Abort handler detected, code 0x86000006 -- IABT (current EL)
[  421.416832] Bad mode in Synchronous Abort handler detected, code 0x86000006 -- IABT (current EL)
[  421.416843] Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP
[  421.416870] Modules linked in: tw6869(O) fuse bnep bluetooth bcmdhd pci_tegra bluedroid_pm
[  421.416882] CPU: 4 PID: 2104 Comm: v4l2_tester Tainted: G        W  O    4.4.15 #8
[  421.416887] Hardware name: quill (DT)
[  421.416895] task: ffffffc06f68b200 ti: ffffffc1b9368000 task.ti: ffffffc1b9368000
[  421.416901] PC is at 0x0
[  421.416949] LR is at stop_streaming+0x2c/0x1bc [tw6869]
[  421.416955] pc : [<0000000000000000>] lr : [<ffffffbffd0364ec>] pstate: 600000c5
[  421.416958] sp : ffffffc1b936bb40
[  421.416967] x29: ffffffc1b936bb40 x28: ffffffc0011bb8b0 
[  421.416976] x27: 0000000000000001 x26: ffffffc1b936bd30 
[  421.416984] x25: 0000000000000000 x24: ffffffbffd038150 
[  421.416991] x23: ffffffc1efb3c500 x22: ffffffc1bece31c0 
[  421.416998] x21: ffffffc1bece2c58 x20: ffffffc1bece2d58 
[  421.417006] x19: ffffffc1bece31c0 x18: 0000000000000a03 
[  421.417013] x17: 0000007fb2fe4290 x16: ffffffc0001cf440 
[  421.417020] x15: 0000007fb30b1cc0 x14: 0000000000000000 
[  421.417027] x13: 0000000000000000 x12: 00000000000003f3 
[  421.417034] x11: 0000000000000018 x10: 00000000000008a0 
[  421.417041] x9 : ffffffc1b936b9e0 x8 : ffffffc06f68bb00 
[  421.417048] x7 : 00000000000000b0 x6 : 00000000000089d4 
[  421.417055] x5 : 0000000000000000 x4 : 00000000000005a5 
[  421.417061] x3 : 0000000000000003 x2 : 0000000000000040 
[  421.417068] x1 : 0000000005a605a5 x0 : ffffffc1bece0028 
[  421.417070] 
[  421.417076] Process v4l2_tester (pid: 2104, stack limit = 0xffffffc1b9368020)
[  421.417078] Call trace:
[  421.417083] [<          (null)>]           (null)
[  421.417103] [<ffffffc0006e8dfc>] __vb2_queue_cancel+0x30/0x13c
[  421.417116] [<ffffffc0006ea274>] vb2_core_streamoff+0x48/0xb8
[  421.417126] [<ffffffc0006ec2e4>] vb2_streamoff+0x3c/0x60
[  421.417132] [<ffffffc0006ec358>] vb2_ioctl_streamoff+0x50/0x5c
[  421.417146] [<ffffffc0006d286c>] v4l_streamoff+0x1c/0x24
[  421.417157] [<ffffffc0006d675c>] __video_do_ioctl+0x224/0x298
[  421.417167] [<ffffffc0006d61d4>] video_usercopy+0x22c/0x574
[  421.417176] [<ffffffc0006d6530>] video_ioctl2+0x14/0x1c
[  421.417184] [<ffffffc0006d1634>] v4l2_ioctl+0xbc/0xcc
[  421.417197] [<ffffffc0001cf180>] do_vfs_ioctl+0x324/0x5e4
[  421.417204] [<ffffffc0001cf4c4>] SyS_ioctl+0x84/0x98
[  421.417215] [<ffffffc000084e70>] el0_svc_naked+0x24/0x28
[  421.417224] ---[ end trace 4c16bc62675417ba ]---
[  421.421542] note: v4l2_tester[2104] exited with preempt_count 1
[  421.421600] INFO: rcu_preempt self-detected stall on CPU
[  421.421614]  4-...: (1 GPs behind) idle=13b/140000000000001/0 softirq=8945/8956 fqs=3 
[  421.421624]   (t=21435 jiffies g=5140 c=5139 q=68)
[  421.421633] rcu_preempt kthread starved for 21193 jiffies! g5140 c5139 f0x0 s3 ->state=0x1
[  421.421638] Task dump for CPU 0:
[  421.421652] swapper/0       R  running task        0     0      0 0x00000002
[  421.421655] Call trace:
[  421.421669] [<ffffffc000085db8>] __switch_to+0xa4/0xb0
[  421.421687] [<ffffffc0010ffea0>] init_thread_union+0x3ea0/0x4000
[  421.421690] Task dump for CPU 4:
[  421.421702] v4l2_tester     R  running task        0  2104   1611 0x00000000
[  421.421704] Call trace:
[  421.421716] [<ffffffc000088fd8>] dump_backtrace+0x0/0x100
[  421.421725] [<ffffffc0000891a0>] show_stack+0x14/0x1c
[  421.421741] [<ffffffc0000c9cc8>] sched_show_task+0xa8/0xfc
[  421.421751] [<ffffffc0000cbff4>] dump_cpu_task+0x40/0x4c
[  421.421761] [<ffffffc0000f9864>] rcu_dump_cpu_stacks+0x94/0xe4
[  421.421770] [<ffffffc0000fd7a4>] rcu_check_callbacks+0x4fc/0xaa0
[  421.421780] [<ffffffc00010223c>] update_process_times+0x3c/0x6c
[  421.421791] [<ffffffc000111180>] tick_sched_handle.isra.16+0x20/0x78
[  421.421797] [<ffffffc00011121c>] tick_sched_timer+0x44/0x7c
[  421.421805] [<ffffffc00010296c>] __hrtimer_run_queues+0x140/0x350
[  421.421814] [<ffffffc0001033cc>] hrtimer_interrupt+0x9c/0x1e0
[  421.421823] [<ffffffc000842b14>] tegra186_timer_isr+0x24/0x30
[  421.421836] [<ffffffc0000f02f4>] handle_irq_event_percpu+0x84/0x290
[  421.421842] [<ffffffc0000f0544>] handle_irq_event+0x44/0x74
[  421.421851] [<ffffffc0000f384c>] handle_fasteoi_irq+0xb4/0x188
[  421.421860] [<ffffffc0000ef914>] generic_handle_irq+0x24/0x38
[  421.421869] [<ffffffc0000efc1c>] __handle_domain_irq+0x60/0xb4
[  421.421875] [<ffffffc0000815dc>] gic_handle_irq+0x5c/0xb4
[  421.421882] [<ffffffc0000845e8>] el1_irq+0x68/0xd8
[  421.421897] [<ffffffc000a5e470>] __schedule+0x4a8/0x6d4
[  421.421904] [<ffffffc000a5e940>] preempt_schedule_common+0x28/0x48
[  421.421910] [<ffffffc000a5e9c4>] _cond_resched+0x34/0x3c
[  421.421924] [<ffffffc00019350c>] unmap_single_vma+0x3fc/0x57c
[  421.421930] [<ffffffc000193e2c>] unmap_vmas+0x58/0x70
[  421.421943] [<ffffffc00019bc04>] exit_mmap+0x88/0xfc
[  421.421957] [<ffffffc00009d6a0>] mmput+0x50/0xf4
[  421.421970] [<ffffffc0000a2094>] do_exit+0x234/0x9a0
[  421.421977] [<ffffffc000089330>] die+0x188/0x1a0
[  421.421986] [<ffffffc00008940c>] arm64_notify_die+0x1c/0x58
[  421.421994] [<ffffffc000089660>] bad_mode+0x84/0x90
[  421.422022] [<ffffffbffd0364ec>] stop_streaming+0x2c/0x1bc [tw6869]
[  421.422033] [<ffffffc0006e8dfc>] __vb2_queue_cancel+0x30/0x13c
[  421.422042] [<ffffffc0006ea274>] vb2_core_streamoff+0x48/0xb8
[  421.422049] [<ffffffc0006ec2e4>] vb2_streamoff+0x3c/0x60
[  421.422054] [<ffffffc0006ec358>] vb2_ioctl_streamoff+0x50/0x5c
[  421.422063] [<ffffffc0006d286c>] v4l_streamoff+0x1c/0x24
[  421.422071] [<ffffffc0006d675c>] __video_do_ioctl+0x224/0x298
[  421.422080] [<ffffffc0006d61d4>] video_usercopy+0x22c/0x574
[  421.422088] [<ffffffc0006d6530>] video_ioctl2+0x14/0x1c
[  421.422095] [<ffffffc0006d1634>] v4l2_ioctl+0xbc/0xcc
[  421.422104] [<ffffffc0001cf180>] do_vfs_ioctl+0x324/0x5e4
[  421.422110] [<ffffffc0001cf4c4>] SyS_ioctl+0x84/0x98
[  421.422117] [<ffffffc000084e70>] el0_svc_naked+0x24/0x28
[  421.955634] Internal error: Oops - bad mode: 0 [#2] PREEMPT SMP
[  421.961549] Modules linked in: tw6869(O) fuse bnep bluetooth bcmdhd pci_tegra bluedroid_pm
[  421.969942] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G      D W  O    4.4.15 #8
[  421.977071] Hardware name: quill (DT)
[  421.980737] task: ffffffc1f68ca580 ti: ffffffc1f68dc000 task.ti: ffffffc1f68dc000
[  421.988215] PC is at 0x0
[  421.990757] LR is at t18x_a57_enter_state+0x20/0xc4
[  421.995633] pc : [<0000000000000000>] lr : [<ffffffc00086edac>] pstate: 800000c5
[  422.003021] sp : ffffffc1f68dfec0
[  422.006335] x29: ffffffc1f68dfec0 x28: ffffffc1f68dc000 
[  422.011673] x27: ffffffc000a6cf20 x26: 000000621c34b440 
[  422.017010] x25: ffffffc001263000 x24: 0000000000000000 
[  422.022343] x23: ffffffc0011f6b60 x22: ffffffc0011f6b78 
[  422.027678] x21: ffffffc001263188 x20: ffffffc00132bec8 
[  422.033013] x19: 0000000000000000 x18: 0000000000000014 
[  422.038347] x17: 0000007f7cb5ef68 x16: ffffffc0001d128c 
[  422.043682] x15: 001dcd6500000000 x14: 00020b4900000000 
[  422.049016] x13: ffffffffa3e740ed x12: 0000000000001000 
[  422.054351] x11: 0000000000001000 x10: 00000000000008a0 
[  422.059685] x9 : ffffffc1f68dfec0 x8 : ffffffc1f68cae80 
[  422.065019] x7 : 000000620a5d6000 x6 : 0000000000012d7a 
[  422.070352] x5 : 0000000000000000 x4 : 00ffffffffffffff 
[  422.075687] x3 : 000000003b9aca00 x2 : 00000000003ce940 
[  422.081021] x1 : 0000000000000000 x0 : 0000000000000000 
[  422.086354] 
[  422.087848] Process swapper/3 (pid: 0, stack limit = 0xffffffc1f68dc020)
[  422.094541] Call trace:
[  422.096989] [<          (null)>]           (null)
[  422.101703] [<ffffffc000736644>] cpuidle_enter_state+0x88/0x2dc
[  422.107623] [<ffffffc0007368d0>] cpuidle_enter+0x18/0x20
[  422.112940] [<ffffffc0000e3040>] call_cpuidle+0x28/0x50
[  422.118167] [<ffffffc0000e31e4>] cpu_startup_entry+0x17c/0x340
[  422.123999] [<ffffffc00008e5ec>] secondary_start_kernel+0x130/0x168
[  422.130263] [<000000008008192c>] 0x8008192c
[  422.134447] ---[ end trace 4c16bc62675417bb ]---
[  422.143038] Kernel panic - not syncing: Attempted to kill the idle task!
[  423.368610] SMP: failed to stop secondary CPUs
[  423.391121] Rebooting in 5 seconds..
[  429.614547] SMP: failed to stop secondary CPUs

crash log 02

[   60.070708] ------------[ cut here ]------------
[   60.075360] WARNING: at ffffffc000137e04 [verbose debug info unavailable]
[   60.082283] ---[ end trace e52243c4af1c151b ]---
[   60.086922] Call trace:
[   60.094628] ------------[ cut here ]------------
[   60.099272] WARNING: at ffffffc000137e04 [verbose debug info unavailable]
[   60.106186] ---[ end trace e52243c4af1c151c ]---
[   60.110826] Call trace:
[  121.404453] Bad mode in Synchronous Abort handler detected, code 0x86000005 -- IABT (current EL)
[  121.404455] Bad mode in Synchronous Abort handler detected, code 0x86000005 -- IABT (current EL)
[  121.404459] Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP
[  121.404466] Modules linked in: tw6869(O) fuse bnep bluetooth bcmdhd pci_tegra bluedroid_pm
[  121.404469] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G        W  O    4.4.15 #8
[  121.404470] Hardware name: quill (DT)
[  121.404471] task: ffffffc1f68cbe80 ti: ffffffc1f68e4000 task.ti: ffffffc1f68e4000
[  121.404473] PC is at 0x0
[  121.404477] LR is at t18x_a57_enter_state+0x20/0xc4
[  121.404478] pc : [<0000000000000000>] lr : [<ffffffc00086edac>] pstate: 800000c5
[  121.404479] sp : ffffffc1f68e7ec0
[  121.404482] x29: ffffffc1f68e7ec0 x28: ffffffc1f68e4000 
[  121.404484] x27: ffffffc000a6cf20 x26: 0000001c530676be 
[  121.404486] x25: ffffffc001263000 x24: 0000000000000000 
[  121.404488] x23: ffffffc0011f6b60 x22: ffffffc0011f6b78 
[  121.404490] x21: ffffffc001263188 x20: ffffffc00132bec8 
[  121.404492] x19: 0000000000000000 x18: 0000000000000006 
[  121.404494] x17: 0000007fa8d1c2a8 x16: ffffffc00020027c 
[  121.404495] x15: 001dcd6500000000 x14: 000856a140000000 
[  121.404497] x13: ffffffffa3e74f97 x12: 0000000000000017 
[  121.404499] x11: 0000000000000038 x10: 00000000000008a0 
[  121.404501] x9 : ffffffc1f68e7ec0 x8 : ffffffc1f68cc780 
[  121.404503] x7 : 0000001c2bf5d700 x6 : 0000000000004dba 
[  121.404504] x5 : 0000000000000000 x4 : 00ffffffffffffff 
[  121.404506] x3 : 000000003b9aca00 x2 : 00000000003cca24 
[  121.404508] x1 : 0000000000000000 x0 : 0000000000000000 
[  121.404508] 
[  121.404510] Process swapper/5 (pid: 0, stack limit = 0xffffffc1f68e4020)
[  121.404511] Call trace:
[  121.404512] [<          (null)>]           (null)
[  121.404515] [<ffffffc000736644>] cpuidle_enter_state+0x88/0x2dc
[  121.404518] [<ffffffc0007368d0>] cpuidle_enter+0x18/0x20
[  121.404520] [<ffffffc0000e3040>] call_cpuidle+0x28/0x50
[  121.404523] [<ffffffc0000e31e4>] cpu_startup_entry+0x17c/0x340
[  121.404525] [<ffffffc00008e5ec>] secondary_start_kernel+0x130/0x168
[  121.404527] [<000000008008192c>] 0x8008192c
[  121.404529] ---[ end trace e52243c4af1c151d ]---
[  121.405633] Kernel panic - not syncing: Attempted to kill the idle task!
[  121.633150] Internal error: Oops - bad mode: 0 [#2] PREEMPT SMP
[  121.639757] Modules linked in: tw6869(O) fuse bnep bluetooth bcmdhd pci_tegra bluedroid_pm
[  121.649551] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G      D W  O    4.4.15 #8
[  121.657442] Hardware name: quill (DT)
[  121.661866] task: ffffffc1f68ca580 ti: ffffffc1f68dc000 task.ti: ffffffc1f68dc000
[  121.670975] PC is at 0x0
[  121.674328] LR is at t18x_a57_enter_state+0x20/0xc4
[  121.680046] pc : [<0000000000000000>] lr : [<ffffffc00086edac>] pstate: 800000c5
[  121.689178] sp : ffffffc1f68dfec0
[  121.693379] x29: ffffffc1f68dfec0 x28: ffffffc1f68dc000 
[  121.699602] x27: ffffffc000a6cf20 x26: 0000001c530676be 
[  121.705829] x25: ffffffc001263000 x24: 0000000000000000 
[  121.712053] x23: ffffffc0011f6b60 x22: ffffffc0011f6b78 
[  121.718281] x21: ffffffc001263188 x20: ffffffc00132bec8 
[  121.724512] x19: 0000000000000000 x18: 0000000000000032 
[  121.730740] x17: 0000007fa607d2a8 x16: ffffffc00020027c 
[  121.736951] x15: 01c57557e5294612 x14: 0000000000000000 
[  121.743150] x13: 0000000000000000 x12: 0000000000000000 
[  121.749346] x11: 0000000000000400 x10: 00000000000008a0 
[  121.755518] x9 : ffffffc1f68dfec0 x8 : ffffffc1f68cae80 
[  121.761703] x7 : 0000001be9927860 x6 : 0000000000004dba 
[  121.767906] x5 : 0000000000000000 x4 : 00ffffffffffffff 
[  121.774109] x3 : 000000003b9aca00 x2 : 00000000003cca24 
[  121.780305] x1 : 0000000000000000 x0 : 0000000000000000 
[  121.786484] 
[  121.788798] Process swapper/3 (pid: 0, stack limit = 0xffffffc1f68dc020)
[  121.796340] Call trace:
[  121.799605] [<          (null)>]           (null)
[  121.805122] [<ffffffc000736644>] cpuidle_enter_state+0x88/0x2dc
[  121.811848] [<ffffffc0007368d0>] cpuidle_enter+0x18/0x20
[  121.817936] [<ffffffc0000e3040>] call_cpuidle+0x28/0x50
[  121.823916] [<ffffffc0000e31e4>] cpu_startup_entry+0x17c/0x340
[  121.830484] [<ffffffc00008e5ec>] secondary_start_kernel+0x130/0x168
[  121.837463] [<000000008008192c>] 0x8008192c
[  121.842362] ---[ end trace e52243c4af1c151e ]---
[  122.456408] SMP: failed to stop secondary CPUs
[  122.467628] Rebooting in 5 seconds..
[  128.521123] SMP: failed to stop secondary CPUs

crash log 03

[   44.164193] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [nvmap-bz:87]
[   48.164199] ------------[ cut here ]------------
[   48.168805] WARNING: at ffffffc000137e04 [verbose debug info unavailable]
[   48.175688] ---[ end trace f9762fc0c11ae088 ]---
[   48.180293] Call trace:
[   72.164190] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [nvmap-bz:87]
[   91.416259] Bad mode in Synchronous Abort handler detected, code 0x86000006 -- IABT (current EL)
[   91.425027] Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP
[   91.430932] Modules linked in: tw6869(O) vfat fat fuse bnep uvcvideo snd_usb_audio videobuf2_vmalloc snd_usbmidi_lib bluetooth bcmdhd pci_tegra bluedroid_pm
[   91.445028] CPU: 3 PID: 87 Comm: nvmap-bz Tainted: G        W  O L  4.4.15 #2
[   91.452147] Hardware name: quill (DT)
[   91.455799] task: ffffffc1f4c55780 ti: ffffffc1f4dc0000 task.ti: ffffffc1f4dc0000
[   91.463265] PC is at 0x0
[   91.465791] LR is at smp_call_function_many+0x250/0x2ec
[   91.471003] pc : [<0000000000000000>] lr : [<ffffffc0001163d4>] pstate: 80000045
[   91.478381] sp : ffffffc1f4dc3c80
[   91.481685] x29: ffffffc1f4dc3c80 x28: 0000000000000040 
[   91.487002] x27: ffffffc001100af0 x26: ffffffc001100af0 
[   91.492318] x25: ffffffc0010d5500 x24: 0000000000000000 
[   91.497633] x23: ffffffc00009b150 x22: 0000000000000001 
[   91.502948] x21: ffffffc1ffce7580 x20: ffffffc001101000 
[   91.508263] x19: ffffffc1ffce7588 x18: 0000000000000000 
[   91.513579] x17: 0000000000000000 x16: ffffffc000a6da80 
[   91.518894] x15: ffffffc000a6da80 x14: 00000000fa83b2da 
[   91.524210] x13: 0000000000000001 x12: 0000000000c1ac52 
[   91.529525] x11: 0000000000000000 x10: 00000000000008a0 
[   91.534840] x9 : ffffffc1f4dc3cd0 x8 : ffffffc1f4c56080 
[   91.540156] x7 : ffffffbdc8bf62e0 x6 : ffffffbdc8bf62a0 
[   91.545470] x5 : ffffffc1ffcb83b0 x4 : 0000000000000037 
[   91.550785] x3 : 0000000000000000 x2 : 0000000000000000 
[   91.556101] x1 : ffffffc1ffcb8398 x0 : 0000000000000003 
[   91.561415] 
[   91.562900] Process nvmap-bz (pid: 87, stack limit = 0xffffffc1f4dc0020)
[   91.569585] Call trace:
[   91.572023] [<          (null)>]           (null)
[   91.576716] [<ffffffc0001164b0>] smp_call_function+0x40/0x70
[   91.582361] [<ffffffc000116510>] on_each_cpu+0x30/0x80
[   91.587487] [<ffffffc0003a8678>] nvmap_inner_clean_cache_all+0x48/0x58
[   91.594001] [<ffffffc0003b5768>] pp_clean_cache+0xc0/0x154
[   91.599473] [<ffffffc0003b5c44>] nvmap_background_zero_thread+0x3a4/0x4ec
[   91.606246] [<ffffffc0000bcebc>] kthread+0xe0/0xf4
[   91.611025] [<ffffffc000084e10>] ret_from_fork+0x10/0x40
[   91.616325] ---[ end trace f9762fc0c11ae089 ]---
[   91.622468] note: nvmap-bz[87] exited with preempt_count 2
[   91.627945] ------------[ cut here ]------------
[   91.632549] WARNING: at ffffffc0000a452c [verbose debug info unavailable]
[   91.639320] Modules linked in: tw6869(O) vfat fat fuse bnep uvcvideo snd_usb_audio videobuf2_vmalloc snd_usbmidi_lib bluetooth bcmdhd pci_tegra bluedroid_pm
[   91.653415] 
[   91.654900] CPU: 3 PID: 87 Comm: nvmap-bz Tainted: G      D W  O L  4.4.15 #2
[   91.662019] Hardware name: quill (DT)
[   91.665670] task: ffffffc1f4c55780 ti: ffffffc1f4dc0000 task.ti: ffffffc1f4dc0000
[   91.673138] PC is at __local_bh_enable_ip+0x68/0xb8
[   91.678010] LR is at _raw_spin_unlock_bh+0x20/0x28
[   91.682788] pc : [<ffffffc0000a452c>] lr : [<ffffffc000a61ed4>] pstate: 400001c5
[   91.690166] sp : ffffffc1f4dc3990
[   91.693471] x29: ffffffc1f4dc3990 x28: 0000000000000040 
[   91.698787] x27: ffffffc001100af0 x26: ffffffc1f4c55780 
[   91.704102] x25: ffffffc1f4c55780 x24: 00000000000001c0 
[   91.709418] x23: 0000000000000001 x22: 0000000000000000 
[   91.714733] x21: ffffffc000e27d68 x20: ffffffc1f4c55780 
[   91.720049] x19: ffffffc0012e2458 x18: 0000000000000000 
[   91.725365] x17: 0000000000000000 x16: ffffffc000a6da80 
[   91.730680] x15: ffffffc000a6da80 x14: 0ffffffffffffffe 
[   91.735996] x13: 0000000000000000 x12: ffffffc001115000 
[   91.741311] x11: 0000000000000006 x10: ffffffc0011154a8 
[   91.746628] x9 : 0000000000000421 x8 : ffffffc1f4c55fa0 
[   91.751944] x7 : 0000000000000000 x6 : ffffff800048f241 
[   91.757259] x5 : ffffffc001119238 x4 : ffffffc001101048 
[   91.762575] x3 : ffffffc001119198 x2 : 0000000000000000 
[   91.767891] x1 : 0000000000000201 x0 : ffffffc00125d000 
[   91.773205] 
[   91.774689] ---[ end trace f9762fc0c11ae08a ]---
[   91.779293] Call trace:
[   91.781732] [<ffffffc0000a452c>] __local_bh_enable_ip+0x68/0xb8
[   91.787638] [<ffffffc000a61ed4>] _raw_spin_unlock_bh+0x20/0x28
[   91.793465] [<ffffffc0001272ac>] cgroup_exit+0x58/0xe4
[   91.798597] [<ffffffc0000a20fc>] do_exit+0x29c/0x9a0
[   91.803551] [<ffffffc000089330>] die+0x188/0x1a0
[   91.808158] [<ffffffc00008940c>] arm64_notify_die+0x1c/0x58
[   91.813718] [<ffffffc000089660>] bad_mode+0x84/0x90
[   91.818584] [<ffffffc0001163d4>] smp_call_function_many+0x250/0x2ec
[   91.824836] [<ffffffc0001164b0>] smp_call_function+0x40/0x70
[   91.830482] [<ffffffc000116510>] on_each_cpu+0x30/0x80
[   91.835608] [<ffffffc0003a8678>] nvmap_inner_clean_cache_all+0x48/0x58
[   91.842121] [<ffffffc0003b5768>] pp_clean_cache+0xc0/0x154
[   91.847593] [<ffffffc0003b5c44>] nvmap_background_zero_thread+0x3a4/0x4ec
[   91.854365] [<ffffffc0000bcebc>] kthread+0xe0/0xf4
[   91.859144] [<ffffffc000084e10>] ret_from_fork+0x10/0x40
[  120.152221] Watchdog detected hard LOCKUP on cpu 3
[  120.157021] ------------[ cut here ]------------
[  120.161859] WARNING: at ffffffc000137e04 [verbose debug info unavailable]
[  120.168654] Modules linked in: tw6869(O) vfat fat fuse bnep uvcvideo snd_usb_audio videobuf2_vmalloc snd_usbmidi_lib bluetooth bcmdhd pci_tegra bluedroid_pm
[  120.182995] 
[  120.184603] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G      D W  O L  4.4.15 #2
[  120.191751] Hardware name: quill (DT)
[  120.195438] task: ffffffc1f68c9900 ti: ffffffc1f68d4000 task.ti: ffffffc1f68d4000
[  120.203040] PC is at watchdog_timer_fn+0x230/0x33c
[  120.207854] LR is at watchdog_timer_fn+0x230/0x33c
[  120.212662] pc : [<ffffffc000137e04>] lr : [<ffffffc000137e04>] pstate: 600001c5
[  120.220065] sp : ffffffc1f68d7af0
[  120.223394] x29: ffffffc1f68d7af0 x28: 0000000000000003 
[  120.228766] x27: ffffffc001100b28 x26: ffffffc1ffcd1260 
[  120.234135] x25: ffffffc0010d02c0 x24: ffffffc1f68d7dd0 
[  120.239506] x23: 0000000000000000 x22: 000000000000006c 
[  120.244878] x21: ffffffc001100000 x20: ffffffc0010d0000 
[  120.250248] x19: ffffffc0010d0248 x18: 0000000000000000 
[  120.255610] x17: 0000007f7c0a70a0 x16: ffffffc0001d128c 
[  120.260971] x15: 00000000ce248c14 x14: 0000000000000000 
[  120.266332] x13: 00000000ce248c14 x12: 0000000000000000 
[  120.271692] x11: 0000000000000400 x10: ffffffc0011154a8 
[  120.277054] x9 : 000000000000044d x8 : 206e6f2050554b43 
[  120.282416] x7 : 4f4c206472616820 x6 : ffffff800048fce8 
[  120.287779] x5 : 0000000000000000 x4 : 0000000000000000 
[  120.293136] x3 : ffffffc1f68d7960 x2 : 0000000000010001 
[  120.298497] x1 : ffffffc1f68d4000 x0 : 0000000000000026 
[  120.303860] 
[  120.305371] ---[ end trace f9762fc0c11ae08b ]---
[  120.310000] Call trace:
[  120.312477] [<ffffffc000137e04>] watchdog_timer_fn+0x230/0x33c
[  120.318334] [<ffffffc00010296c>] __hrtimer_run_queues+0x140/0x350
[  120.324447] [<ffffffc0001033cc>] hrtimer_interrupt+0x9c/0x1e0
[  120.330214] [<ffffffc0008429a4>] tegra186_timer_isr+0x24/0x30
[  120.335983] [<ffffffc0000f02f4>] handle_irq_event_percpu+0x84/0x290
[  120.342263] [<ffffffc0000f0544>] handle_irq_event+0x44/0x74
[  120.347856] [<ffffffc0000f384c>] handle_fasteoi_irq+0xb4/0x188
[  120.353712] [<ffffffc0000ef914>] generic_handle_irq+0x24/0x38
[  120.359475] [<ffffffc0000efc1c>] __handle_domain_irq+0x60/0xb4
[  120.365325] [<ffffffc0000815dc>] gic_handle_irq+0x5c/0xb4
[  120.370746] [<ffffffc0000845e8>] el1_irq+0x68/0xd8
[  120.375576] [<ffffffc000736768>] cpuidle_enter+0x18/0x20
[  120.380912] [<ffffffc0000e3040>] call_cpuidle+0x28/0x50
[  120.386158] [<ffffffc0000e31e4>] cpu_startup_entry+0x17c/0x340
[  120.392011] [<ffffffc00008e5ec>] secondary_start_kernel+0x130/0x168
[  120.398289] [<000000008008192c>] 0x8008192c
[0000.268] I> Welcome to MB2(TBoot-BPMP)(version: 01.00.160913-t186-M-00.00-mobile-f4c42291)

crash log 04

[   91.412319] Bad mode in Synchronous Abort handler detected, code 0x86000006 -- IABT (current EL)
[   91.421092] Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP
[   91.426998] Modules linked in: tw6869(O) vfat fat fuse bnep uvcvideo snd_usb_audio snd_usbmidi_lib videobuf2_vmalloc bluetooth bcmdhd pci_tegra bluedroid_pm
[   91.441105] CPU: 4 PID: 297 Comm: kworker/4:2 Tainted: G           O    4.4.15 #2
[   91.448570] Hardware name: quill (DT)
[   91.452245] Workqueue: events tw6869_delayed_dma_on [tw6869]
[   91.457898] task: ffffffc1eecd9900 ti: ffffffc1ec504000 task.ti: ffffffc1ec504000
[   91.465362] PC is at 0x0
[   91.467891] LR is at tw6869_delayed_dma_on+0x24/0x174 [tw6869]
[   91.473710] pc : [<0000000000000000>] lr : [<ffffffbffd0ca4e0>] pstate: 600000c5
[   91.481086] sp : ffffffc1ec507d50
[   91.484390] x29: ffffffc1ec507d50 x28: 0000000000000000 
[   91.489705] x27: 0000000000000000 x26: ffffffc00125e5f8 
[   91.495020] x25: 0000000000000000 x24: 0000000000000000 
[   91.500333] x23: ffffffc1ffcfc400 x22: ffffffc1eb71a430 
[   91.505646] x21: ffffffc1ffcf6b80 x20: ffffffc1c6d423b8 
[   91.510958] x19: ffffffc1c6d423e0 x18: 0000000000000021 
[   91.516273] x17: 0000007fb30e7f68 x16: ffffffc000a6da80 
[   91.521587] x15: 00000000fa83b2da x14: 00000000001bfdd4 
[   91.526901] x13: 0000000000000000 x12: 00000000001c13ce 
[   91.532214] x11: 0000000000000400 x10: 00000000000008a0 
[   91.537529] x9 : ffffffc1ec507d30 x8 : ffffffc1eecda200 
[   91.542844] x7 : 0000000000000400 x6 : 00000000000033d8 
[   91.548159] x5 : 0000000000000000 x4 : 0000000000000013 
[   91.553473] x3 : 0000000000000007 x2 : 0000000000000040 
[   91.558786] x1 : 0000000000140013 x0 : ffffffc1c6d40028 
[   91.564101] 
[   91.565585] Process kworker/4:2 (pid: 297, stack limit = 0xffffffc1ec504020)
[   91.572617] Call trace:
[   91.575056] [<          (null)>]           (null)
[   91.579757] [<ffffffc0000b7240>] process_one_work+0x154/0x434
[   91.585490] [<ffffffc0000b7654>] worker_thread+0x134/0x40c
[   91.590966] [<ffffffc0000bcebc>] kthread+0xe0/0xf4
[   91.595748] [<ffffffc000084e10>] ret_from_fork+0x10/0x40
[   91.601047] ---[ end trace ce114fa90476cd84 ]---
[   91.606823] note: kworker/4:2[297] exited with preempt_count 1
[   91.612649] ------------[ cut here ]------------
[   91.617253] WARNING: at ffffffc0000a452c [verbose debug info unavailable]
[   91.624023] Modules linked in: tw6869(O) vfat fat fuse bnep uvcvideo snd_usb_audio snd_usbmidi_lib videobuf2_vmalloc bluetooth bcmdhd pci_tegra bluedroid_pm
[   91.638113] 
[   91.639597] CPU: 4 PID: 297 Comm: kworker/4:2 Tainted: G      D    O    4.4.15 #2
[   91.647060] Hardware name: quill (DT)
[   91.650715] task: ffffffc1eecd9900 ti: ffffffc1ec504000 task.ti: ffffffc1ec504000
[   91.658183] PC is at __local_bh_enable_ip+0x68/0xb8
[   91.663057] LR is at _raw_spin_unlock_bh+0x20/0x28
[   91.667833] pc : [<ffffffc0000a452c>] lr : [<ffffffc000a61ed4>] pstate: 400001c5
[   91.675210] sp : ffffffc1ec507a60
[   91.678513] x29: ffffffc1ec507a60 x28: 0000000000000000 
[   91.683828] x27: 0000000000000000 x26: ffffffc1eecd9900 
[   91.689140] x25: ffffffc1eecd9900 x24: 00000000000001c0 
[   91.694455] x23: 0000000000000001 x22: 0000000000000000 
[   91.699769] x21: ffffffc000e27d68 x20: ffffffc1eecd9900 
[   91.705083] x19: ffffffc0012e2458 x18: 0000000000000021 
[   91.710396] x17: 0000007fb30e7f68 x16: ffffffc000a6da80 
[   91.715710] x15: 00000000fa83b2da x14: 20746e756f635f74 
[   91.721022] x13: 706d656572702068 x12: 7469772064657469 
[   91.726335] x11: 7865205d3739325b x10: 323a342f72656b72 
[   91.731649] x9 : 00000000000003ba x8 : ffffffc1eecda120 
[   91.736964] x7 : 0000000000000000 x6 : ffffff800048f016 
[   91.742277] x5 : ffffffc001119238 x4 : ffffffc001101048 
[   91.747591] x3 : ffffffc001119198 x2 : 0000000000000000 
[   91.752906] x1 : 0000000000000201 x0 : ffffffc00125d000 
[   91.758222] 
[   91.759705] ---[ end trace ce114fa90476cd85 ]---
[   91.764308] Call trace:
[   91.766746] [<ffffffc0000a452c>] __local_bh_enable_ip+0x68/0xb8
[   91.772651] [<ffffffc000a61ed4>] _raw_spin_unlock_bh+0x20/0x28
[   91.778478] [<ffffffc0001272ac>] cgroup_exit+0x58/0xe4
[   91.783611] [<ffffffc0000a20fc>] do_exit+0x29c/0x9a0
[   91.788564] [<ffffffc000089330>] die+0x188/0x1a0
[   91.793170] [<ffffffc00008940c>] arm64_notify_die+0x1c/0x58
[   91.798729] [<ffffffc000089660>] bad_mode+0x84/0x90
[   91.803598] [<ffffffbffd0ca4e0>] tw6869_delayed_dma_on+0x24/0x174 [tw6869]
[   91.810457] [<ffffffc0000b7240>] process_one_work+0x154/0x434
[   91.816187] [<ffffffc0000b7654>] worker_thread+0x134/0x40c
[   91.821659] [<ffffffc0000bcebc>] kthread+0xe0/0xf4
[   91.826437] [<ffffffc000084e10>] ret_from_fork+0x10/0x40
[0000.267] I> Welcome to MB2(TBoot-BPMP)(version: 01.00.160913-t186-M-00.00-mobile-f4c42291)
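
Because every log looks different, it may help to tabulate, for each Oops, which CPU and task were running and where the link register pointed. The following is a minimal sketch that extracts this from saved dmesg text. The regular expressions assume the exact line formats seen in the logs above ("CPU: n PID: n Comm: name", "PC is at ...", "LR is at ..."), and `summarize_oops` is a helper name introduced here for illustration.

```python
import re

# Patterns assume the arm64 4.4 oops format shown in the logs above.
CPU_RE = re.compile(r"CPU: (\d+) PID: (\d+) Comm: (\S+)")
PC_RE = re.compile(r"PC is at (\S+)")
LR_RE = re.compile(r"LR is at (\S+)(?: \[(\w+)\])?")


def summarize_oops(text):
    """Return one dict per 'LR is at' line, with the preceding CPU/PID/Comm and PC."""
    entries = []
    current = {}
    for line in text.splitlines():
        m = CPU_RE.search(line)
        if m:
            # A new register dump starts with the CPU/PID/Comm line.
            current = {"cpu": int(m.group(1)), "pid": int(m.group(2)), "comm": m.group(3)}
            continue
        m = PC_RE.search(line)
        if m:
            current["pc"] = m.group(1)
            continue
        m = LR_RE.search(line)
        if m:
            entry = dict(current)
            entry["lr"] = m.group(1)
            entry["module"] = m.group(2)  # None when the symbol is in the core kernel
            entries.append(entry)
    return entries


if __name__ == "__main__":
    sample = """\
[  421.416882] CPU: 4 PID: 2104 Comm: v4l2_tester Tainted: G        W  O    4.4.15 #8
[  421.416901] PC is at 0x0
[  421.416949] LR is at stop_streaming+0x2c/0x1bc [tw6869]
[   91.445028] CPU: 3 PID: 87 Comm: nvmap-bz Tainted: G        W  O L  4.4.15 #2
[   91.463265] PC is at 0x0
[   91.465791] LR is at smp_call_function_many+0x250/0x2ec
"""
    for e in summarize_oops(sample):
        print(e["cpu"], e["comm"], e["pc"], e["lr"], e["module"])
```

In the four logs above, the first Oops always has PC at 0x0, while the LR points into tw6869 (stop_streaming, tw6869_delayed_dma_on), the idle path (t18x_a57_enter_state) or smp_call_function_many. Faults in unrelated code paths could mean that something is overwriting memory, rather than that a single call site is wrong. That is only a hypothesis and would need to be confirmed.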

The Freescale/NXP i.MX6 that the driver is based on is 32-bit armhf. The Jetson is 64-bit arm64/aarch64, which is a different architecture. I suspect some part of the code being run is architecture dependent. Perhaps not, but even if 99% of the code builds correctly for arm64, there may be some small piece of incompatible assembler. I have not gone through that code to be sure, but I would start by looking for an arm64 variant.

Hi linuxdev,
Thank you very much for your reply. We have tried another driver, but the kernel still crashes randomly.
Here are our steps:
1) We downloaded the 4.9.147 kernel source from www.kernel.org.
2) In the source code we found drivers/media/pci/tw686x/.
3) We modified some files to make the driver compatible with our current 4.4.15 kernel; only small changes were needed. We uploaded the source code to GitHub (https://github.com/liukejob/tw6869-driver-for-tx2.git).
4) We compile the driver as a module, install it after the system is ready, and use V4L2 to open the 8 video devices created by tw686x.ko.

Here are our tests:
1) We use V4L2 to open the video devices created. After running for a long time the kernel crashes; the logs are as follows:
crash log 01:

[  910.832322] bwmgr: clk_set_rate failed for freq 18446744073709551506 Hz with errno -22
[  911.836349] sdhci-tegra 3460000.sdhci: clock enable is failed, ret: -110
[0000.277] I> Welcome to MB2(TBoot-BPMP)(version: 01.00.160913-t186-M-00.00-mobile-f4c42291)
[0000.285] I> Default Heap @ [0xd486400 - 0xd488400]
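
A note on the bwmgr lines in these logs: the "freq" values are just below 2^64, which is not a plausible clock rate. The following is a minimal sketch that reinterprets them as signed 64-bit numbers and looks them up in a small errno table. The table holds only three Linux errno values, and the function names are introduced here for illustration.

```python
# Sketch: check whether a huge unsigned "frequency" is really a negative errno
# that was printed as an unsigned 64-bit value.

# Small subset of Linux errno numbers (asm-generic values).
LINUX_ERRNO = {22: "EINVAL", 62: "ETIME", 110: "ETIMEDOUT"}


def as_signed64(value):
    """Reinterpret an unsigned 64-bit integer as two's-complement signed."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value


def explain(value):
    """Return (signed value, errno name or None) for a printed unsigned value."""
    signed = as_signed64(value)
    # The kernel treats -1..-4095 as the error-code range (MAX_ERRNO).
    if -4095 <= signed < 0:
        return signed, LINUX_ERRNO.get(-signed, "errno not in this table")
    return signed, None


if __name__ == "__main__":
    # The two values printed by bwmgr in the logs of this post.
    for freq in (18446744073709551506, 18446744073709551554):
        print(freq, explain(freq))
```

Read this way, 18446744073709551506 is -110 (ETIMEDOUT) and 18446744073709551554 is -62 (ETIME). The -110 matches the "ret: -110" in the sdhci-tegra line. One possible reading is that a clock query failed with a timeout and the error code was then passed on as a frequency. This has not been checked against the bwmgr source.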

crash log 02:

[ 3168.141302] bwmgr: clk_set_rate failed for freq 18446744073709551554 Hz with errno -22
[ 3174.141073] xhci-tegra 3530000.xhci: Timeout while waiting for stop endpoint command
[ 3175.137094] bwmgr: clk_set_rate failed for freq 18446744073709551554 Hz with errno -22
[ 3178.137093] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 3178.142829]  0-...: (0 ticks this GP) idle=e91/1/0 softirq=516154/516154 fqs=0 
[ 3178.150143]  (detected by 1, t=5303 jiffies, g=276795, c=276794, q=66)
[ 3178.156701] Task dump for CPU 0:
[ 3178.159950] swapper/0       R  running task        0     0      0 0x00000002
[ 3178.167045] Call trace:
[ 3178.169747] rcu_preempt kthread starved for 5303 jiffies! g276795 c276794 f0x0 s3 ->state=0x1
[ 3179.177158] bwmgr: clk_set_rate failed for freq 18446744073709551554 Hz with errno -22
[ 3180.149090] ------------[ cut here ]------------
[ 3180.153708] WARNING: at ffffffc000137e04 [verbose debug info unavailable]
[ 3180.161019] ---[ end trace 3deaae151174838c ]---
[ 3180.165631] Call trace:
[ 3180.173060] ------------[ cut here ]------------
[ 3180.177670] WARNING: at ffffffc000137e04 [verbose debug info unavailable]
[ 3180.184568] ---[ end trace 3deaae151174838d ]---
[ 3180.189173] Call trace:
[ 3192.141103] bwmgr: clk_set_rate failed for freq 18446744073709551554 Hz with errno -22
[ 3204.141141] bwmgr: clk_set_rate failed for freq 18446744073709551554 Hz with errno -22
[ 3216.141202] bwmgr: clk_set_rate failed for freq 18446744073709551554 Hz with errno -22
[ 3241.423251] Bad mode in Synchronous Abort handler detected, code 0x86000006 -- IABT (current EL)
[ 3241.432023] Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP
[ 3241.437928] Modules linked in: tw686x(O) vfat fat fuse bnep bluetooth uvcvideo snd_usb_audio videobuf2_vmalloc snd_usbmidi_lib bcmdhd pci_tegra bluedroid_pm
[ 3241.452021] CPU: 4 PID: 0 Comm: swapper/4 Tainted: G        W  O    4.4.15 #23
[ 3241.459225] Hardware name: quill (DT)
[ 3241.462877] task: ffffffc1f68cb200 ti: ffffffc1f68e0000 task.ti: ffffffc1f68e0000
[ 3241.470342] PC is at 0x0
[ 3241.472873] LR is at t18x_a57_enter_state+0x20/0xc4
[ 3241.477737] pc : [<0000000000000000>] lr : [<ffffffc000875b4c>] pstate: 800000c5
[ 3241.485115] sp : ffffffc1f68e3ec0
[ 3241.488419] x29: ffffffc1f68e3ec0 x28: ffffffc1f68e0000 
[ 3241.493732] x27: ffffffc000a73f20 x26: 000002f2b1e7e080 
[ 3241.499046] x25: ffffffc00126b000 x24: 0000000000000000 
[ 3241.504360] x23: ffffffc0011fee60 x22: ffffffc0011fee78 
[ 3241.509675] x21: ffffffc00126b488 x20: ffffffc001333fd0 
[ 3241.514989] x19: 0000000000000000 x18: 00000000005308b8 
[ 3241.520302] x17: 0000007faf893210 x16: ffffffc0001039b0 
[ 3241.525615] x15: 0000007fb131f000 x14: 0000000000000000 
[ 3241.530930] x13: 0000000000000000 x12: 0000000000000000 
[ 3241.536244] x11: 0000000000000000 x10: 00000000000008a0 
[ 3241.541556] x9 : ffffffc1f68e3ec0 x8 : ffffffc1f68cbb00 
[ 3241.546869] x7 : 000002f2b3e01860 x6 : 0000000000179c6a 
[ 3241.552182] x5 : 0000000000000000 x4 : 00ffffffffffffff 
[ 3241.557495] x3 : 000000003b9aca00 x2 : 00000000000ff6b0 
[ 3241.562808] x1 : 0000000000000000 x0 : 0000000000000000 
[ 3241.568121] 
[ 3241.569605] Process swapper/4 (pid: 0, stack limit = 0xffffffc1f68e0020)
[ 3241.576289] Call trace:
[ 3241.578726] [<          (null)>]           (null)
[ 3241.583419] [<ffffffc00073d3f0>] cpuidle_enter_state+0x88/0x2dc
[ 3241.589323] [<ffffffc00073d67c>] cpuidle_enter+0x18/0x20
[ 3241.594623] [<ffffffc0000e3040>] call_cpuidle+0x28/0x50
[ 3241.599835] [<ffffffc0000e31e4>] cpu_startup_entry+0x17c/0x340
[ 3241.605653] [<ffffffc00008e5ec>] secondary_start_kernel+0x130/0x168
[ 3241.611903] [<000000008008192c>] 0x8008192c
[ 3241.616076] ---[ end trace 3deaae151174838e ]---
[ 3241.622221] Kernel panic - not syncing: Attempted to kill the idle task!
[ 3241.628928] CPU2: stopping
[ 3241.631672] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G      D W  O    4.4.15 #23
[ 3241.638896] Hardware name: quill (DT)
[ 3241.642565] Call trace:
[ 3241.645037] [<ffffffc000088fd8>] dump_backtrace+0x0/0x100
[ 3241.650448] [<ffffffc0000891a0>] show_stack+0x14/0x1c
[ 3241.655511] [<ffffffc0003070f0>] dump_stack+0x90/0xb4
[ 3241.660572] [<ffffffc00008ec10>] handle_IPI+0x300/0x30c
[ 3241.665806] [<ffffffc00008161c>] gic_handle_irq+0x9c/0xb4
[ 3241.671212] [<ffffffc0000845e8>] el1_irq+0x68/0xd8
[ 3241.676015] [<ffffffc00073d67c>] cpuidle_enter+0x18/0x20
[ 3241.681338] [<ffffffc0000e3040>] call_cpuidle+0x28/0x50
[ 3241.686572] [<ffffffc0000e31e4>] cpu_startup_entry+0x17c/0x340
[ 3241.692412] [<ffffffc00008e5ec>] secondary_start_kernel+0x130/0x168
[ 3241.698685] [<000000008008192c>] 0x8008192c
[ 3241.702882] CPU1: stopping
[ 3241.705626] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G      D W  O    4.4.15 #23
[ 3241.712850] Hardware name: quill (DT)
[ 3241.716517] Call trace:
[ 3241.718985] [<ffffffc000088fd8>] dump_backtrace+0x0/0x100
[ 3241.724397] [<ffffffc0000891a0>] show_stack+0x14/0x1c
[ 3241.729457] [<ffffffc0003070f0>] dump_stack+0x90/0xb4
[ 3241.734537] [<ffffffc00008ec10>] handle_IPI+0x300/0x30c
[ 3241.739770] [<ffffffc00008161c>] gic_handle_irq+0x9c/0xb4
[ 3241.745177] [<ffffffc0000845e8>] el1_irq+0x68/0xd8
[ 3241.749976] [<ffffffc00073d67c>] cpuidle_enter+0x18/0x20
[ 3241.755299] [<ffffffc0000e3040>] call_cpuidle+0x28/0x50
[ 3241.760533] [<ffffffc0000e31e4>] cpu_startup_entry+0x17c/0x340
[ 3241.766373] [<ffffffc00008e5ec>] secondary_start_kernel+0x130/0x168
[ 3241.772643] [<000000008008192c>] 0x8008192c
[ 3242.670680] SMP: failed to stop secondary CPUs
[ 3242.680333] Rebooting in 5 seconds..
[ 3248.725080] SMP: failed to stop secondary CPUs
[0000.276] I> Welcome to MB2(TBoot-BPMP)(version: 01.00.160913-t186-M-00.00-mobile-f4c42291)
  1. If we repeatedly turn the video on and off, the kernel crashes quickly. The log is as follows:
[  173.201650] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.210735] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.219835] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.228916] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.237996] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.247085] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.256167] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.265257] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.274337] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.283427] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.292509] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.301599] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.310682] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.319771] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.328855] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.337963] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.347043] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.356134] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.365233] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.374314] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.383394] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.392483] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.401565] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.410653] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.419735] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.428824] tegra-pcie 10003000.pcie-controller: PCIE: Response decoding error, signature: 50100001
[  173.437905] tegra-pcie 10003000.pcie-controller: PCIE:
[0000.227] I> Welcome to MB2(TBoot-BPMP)(version: 01.00.160913-t186-M-00.00-mobile-f4c42291)
[0000.236] I> Default Heap @ [0xd486400 - 0xd488400]
[0000.240] I> DMA Heap @ [0x84900000 - 0x85300000]
  1. We moved the code to another arm64 platform (hi3359a) and ran the same test. Everything works fine there.

We are really confused about what happened. Could you give us some ideas on how to debug this? Thanks very much for your help!

I’ve never worked on this driver, so someone else may need to help. However, assuming the driver can validly be built as a module (not all drivers can…did the Kconfig say it was valid?), a common problem is incorrect use of DMA: depending on whether DMA goes through the SMMU (virtual addressing) or uses physical addresses directly, the driver has to be adjusted to that use case. Another possibility is that some code in the driver still needs to be ported to arm64 and makes assumptions that are only valid on x86_64/amd64.

I do see a mention of a DMA Heap in the log, so I would probably start by finding out whether the driver uses the SMMU correctly. On the older arm32 platforms I believe the SMMU was not used for DMA on PCIe, but on arm64 it is used by default (although this may differ if you are not on a recent release…I’m not sure about R27.1, but you’d probably be far better off starting with R28.2.1).
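
As a side note on reading the "Bad mode in Synchronous Abort handler detected, code 0x86000006" lines: that code is the ARMv8 exception syndrome (ESR) value, and it can be split into fields by hand. Below is a minimal sketch of that decoding. The function name `decode_esr` and the returned dictionary layout are my own for illustration, and only the handful of exception classes and fault types relevant here are covered; the bit positions are the ones from the ARMv8-A ESR_ELx layout.

```python
# Minimal ESR_ELx decoder for abort exceptions (illustrative helper, not a kernel API).

# Exception class (EC, bits [31:26]) values for aborts.
EC_NAMES = {
    0x20: "Instruction Abort, lower EL",
    0x21: "Instruction Abort, same EL",
    0x24: "Data Abort, lower EL",
    0x25: "Data Abort, same EL",
}

# Fault status code classes: for codes below 0x10, bits [5:2] select the
# fault type and bits [1:0] give the translation table level.
FAULT_CLASSES = {
    0: "address size fault",
    1: "translation fault",
    2: "access flag fault",
    3: "permission fault",
}


def decode_esr(esr: int) -> dict:
    """Split a 32-bit ESR value into EC, IL and fault status code."""
    ec = (esr >> 26) & 0x3F      # exception class
    il = (esr >> 25) & 0x1       # 1 = 32-bit instruction trapped
    fsc = esr & 0x3F             # IFSC/DFSC, low 6 bits of the ISS field

    if fsc < 0x10:
        fault = f"{FAULT_CLASSES[fsc >> 2]}, level {fsc & 0x3}"
    else:
        fault = f"other fault status 0x{fsc:02x}"

    return {
        "ec": ec,
        "ec_name": EC_NAMES.get(ec, f"unhandled EC 0x{ec:02x}"),
        "il": il,
        "fsc": fsc,
        "fault": fault,
    }


if __name__ == "__main__":
    for key, value in decode_esr(0x86000006).items():
        print(f"{key}: {value}")
```

For 0x86000006 this gives an instruction abort taken at the current exception level with a level 2 translation fault. Together with "PC is at 0x0" in the logs, that is consistent with the kernel branching through a NULL function pointer, which would fit a corrupted or already freed structure rather than one specific bad code path. That is only my reading of the register dump, not a confirmed diagnosis.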