TX2: Unhandled translation errors when enabling the Jailhouse hypervisor

We’re running the default NVIDIA 4.4 kernel on the TX2.
Whenever we enable the Jailhouse hypervisor (https://github.com/evidence/linux-jailhouse-jetson), user-space applications that use accelerated graphics (e.g. Chrome) crash with segmentation faults.

dmesg reports “unhandled level 1 translation fault” (see log below), while Jailhouse itself doesn’t report any error.

Could this issue be related to the SMMU?
Do we need to reprogram the SMMU when running a hypervisor?

Many thanks and best regards.

=====================================

[ 1285.996678] randomFog[2250]: unhandled level 1 translation fault (11) at 0x900001244, esr 0x92000045
[ 1285.996691] pgd = ffffffc063dd4000
[ 1285.996697] [900001244] *pgd=0000000000000000, *pud=0000000000000000

[ 1285.996713] CPU: 4 PID: 2250 Comm: randomFog Tainted: G O 4.4.38 #70
[ 1285.996719] Hardware name: quill (DT)
[ 1285.996726] task: ffffffc1d7e2a580 ti: ffffffc0656d4000 task.ti: ffffffc0656d4000
[ 1285.996733] PC is at 0x7f740390d0
[ 1285.996738] LR is at 0x7f74038cb0
[ 1285.996743] pc : [<0000007f740390d0>] lr : [<0000007f74038cb0>] pstate: 80000000
[ 1285.996748] sp : 0000007fc9a657b0
[ 1285.996753] x29: 0000007fc9a65f80 x28: 00000000006f3d98
[ 1285.996762] x27: 0000007fc9a65910 x26: 0000000000000002
[ 1285.996770] x25: 000000000076b388 x24: 00000000007796e0
[ 1285.996777] x23: 000000000076b380 x22: 0000000080000124
[ 1285.996784] x21: 000000000076b588 x20: 000000000076b380
[ 1285.996792] x19: 000000000076d5f0 x18: 0000000000010000
[ 1285.996799] x17: 0000007f771ce760 x16: 0000007f7464ab48
[ 1285.996806] x15: 0000007f7b454000 x14: 0000000000000000
[ 1285.996813] x13: 0000000000000000 x12: 0000000000000000
[ 1285.996820] x11: 0000000000000000 x10: 0000000000000000
[ 1285.996827] x9 : 0000000000000004 x8 : 0000000000000040
[ 1285.996834] x7 : 0000000000000003 x6 : 0000000000000000
[ 1285.996841] x5 : 0000007f7b4486f0 x4 : 0000000000000000
[ 1285.996848] x3 : 000000000076d5f0 x2 : 000000000076b380
[ 1285.996856] x1 : 0000000080000124 x0 : 0000000900001240

[ 1285.996871] Library at 0x7f740390d0: 0x7f73d65000 /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1
[ 1285.996877] Library at 0x7f74038cb0: 0x7f73d65000 /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1
[ 1285.996883] vdso base = 0x7f7b453000
[ 1343.375250] ---- mlocks ----

[ 1343.375292] ---- syncpts ----
[ 1343.375304] id 4 (disp_d) min 30309 max 30310 refs 1 (previous client : )
[ 1343.375312] id 5 (disp_e) min 2 max 2 refs 1 (previous client : )
[ 1343.375320] id 7 (vblank1) min 33577 max 0 refs 1 (previous client : )
[ 1343.375337] id 18 (17000000.gp10b_507) min 166884 max 166884 refs 1 (previous client : )
[ 1343.375343] id 19 (17000000.gp10b_506) min 246 max 246 refs 1 (previous client : )
[ 1343.375351] id 21 (17000000.gp10b_505) min 161038 max 161044 refs 1 (previous client : 17000000.gp10b_505)
[ 1343.375358] id 22 (17000000.gp10b_504) min 171382 max 171382 refs 1 (previous client : )
[ 1343.375365] id 23 (17000000.gp10b_503) min 12 max 12 refs 1 (previous client : 17000000.gp10b_503)
[ 1343.375372] id 25 (17000000.gp10b_501) min 2 max 4 refs 1 (previous client : )

[ 1343.375919] ---- channels ----
[ 1343.375931]
channel 1 - 15820000.se

[ 1343.375938] NvHost basic channel registers:
[ 1343.375944] CMDFIFO_STAT_0: 00002040
[ 1343.375950] CMDFIFO_RDATA_0: 440e1449
[ 1343.375956] CMDP_OFFSET_0: 00000000
[ 1343.375961] CMDP_CLASS_0: 00000000
[ 1343.375967] CHANNELSTAT_0: 00000000
[ 1343.375971] The CDMA sync queue is empty.

[ 1343.375980]
channel 2 - 15830000.se

[ 1343.375987] NvHost basic channel registers:
[ 1343.375992] CMDFIFO_STAT_0: 00002040
[ 1343.375997] CMDFIFO_RDATA_0: 00104067
[ 1343.376004] CMDP_OFFSET_0: 00000000
[ 1343.376008] CMDP_CLASS_0: 00000000
[ 1343.376013] CHANNELSTAT_0: 00000000
[ 1343.376018] The CDMA sync queue is empty.

[ 1343.376026]
channel 3 - 15840000.se

[ 1343.376033] NvHost basic channel registers:
[ 1343.376038] CMDFIFO_STAT_0: 00002040
[ 1343.376043] CMDFIFO_RDATA_0: a0420880
[ 1343.376049] CMDP_OFFSET_0: 00000000
[ 1343.376054] CMDP_CLASS_0: 00000000
[ 1343.376059] CHANNELSTAT_0: 00000000
[ 1343.376063] The CDMA sync queue is empty.

[ 1343.376073]
---- host general irq ----

[ 1343.376081] sync_intc0mask = 0x00000001
[ 1343.376087] sync_intmask = 0x50000003
[ 1343.376091]
---- host syncpt irq mask ----

[ 1343.376098]
---- host syncpt irq status ----

[ 1343.376106] syncpt_thresh_cpu0_int_status(0) = 0x00000000
[ 1343.376111] syncpt_thresh_cpu0_int_status(1) = 0x00000000
[ 1343.376117] syncpt_thresh_cpu0_int_status(2) = 0x00000000
[ 1343.376122] syncpt_thresh_cpu0_int_status(3) = 0x00000000
[ 1343.376127] syncpt_thresh_cpu0_int_status(4) = 0x00000000
[ 1343.376132] syncpt_thresh_cpu0_int_status(5) = 0x00000000
[ 1343.376138] syncpt_thresh_cpu0_int_status(6) = 0x00000000
[ 1343.376143] syncpt_thresh_cpu0_int_status(7) = 0x00000000
[ 1343.376148] syncpt_thresh_cpu0_int_status(8) = 0x00000000
[ 1343.376153] syncpt_thresh_cpu0_int_status(9) = 0x00000000
[ 1343.376158] syncpt_thresh_cpu0_int_status(10) = 0x00000000
[ 1343.376164] syncpt_thresh_cpu0_int_status(11) = 0x00000000
[ 1343.376169] syncpt_thresh_cpu0_int_status(12) = 0x00000000
[ 1343.376174] syncpt_thresh_cpu0_int_status(13) = 0x00000000
[ 1343.376179] syncpt_thresh_cpu0_int_status(14) = 0x00000000
[ 1343.376184] syncpt_thresh_cpu0_int_status(15) = 0x00000000
[ 1343.376189] syncpt_thresh_cpu0_int_status(16) = 0x00000000
[ 1343.376194] syncpt_thresh_cpu0_int_status(17) = 0x00000000
[ 1343.376201] 17000000.gp10b pbdma 0:
[ 1343.376206] id: 2 (tsg), next_id: 2 (tsg) chan status: invalid
[ 1343.376220] PUT: 0000001e089626f0 GET: 0000001e089606a8 FETCH: 00000ca4 HEADER: 20000df8

[ 1343.376231] 17000000.gp10b eng 0:
[ 1343.376235] id: 5 (tsg), next_id: 2 (tsg), ctx status: switch
[ 1343.376240] faulted
[ 1343.376242] busy

[ 1343.376252] 17000000.gp10b eng 1:
[ 1343.376256] id: 6 (tsg), next_id: 6 (tsg), ctx status: invalid

[ 1343.376420] 498-17000000.gp10b, pid 2260, refs: 2:
[ 1343.376425] channel status: in use idle not busy
[ 1343.376434] RAMFC : TOP: 8000000100650514 PUT: 0000000100650514 GET: 0000000100650514 FETCH: 0000000100650514
HEADER: 60400000 COUNT: 80000000
SYNCPOINT 00000000 00000000 SEMAPHORE 00000001 0000fbe0 000000c8 00001004

[ 1343.376449] 499-17000000.gp10b, pid 2260, refs: 2:
[ 1343.376453] channel status: in use idle not busy
[ 1343.376461] RAMFC : TOP: 8000000100530294 PUT: 0000000100530294 GET: 0000000100530294 FETCH: 0000000100530294
HEADER: 60400000 COUNT: 80000000
SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000

[ 1343.376474] 500-17000000.gp10b, pid 2260, refs: 2:
[ 1343.376478] channel status: in use idle not busy
[ 1343.376486] RAMFC : TOP: 8000000100430294 PUT: 0000000100430294 GET: 0000000100430294 FETCH: 0000000100430294
HEADER: 60400000 COUNT: 80000000
SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000

[ 1343.376499] 501-17000000.gp10b, pid 2260, refs: 4:
[ 1343.376503] channel status: in use on_eng_pending busy
[ 1343.376511] RAMFC : TOP: 8000001f08140018 PUT: 0000001f08140030 GET: 0000001f08140018 FETCH: 0000001f08140030
HEADER: 20011b08 COUNT: a0110002
SYNCPOINT 00000000 00001901 SEMAPHORE 00000001 00017ffc 00000002 00080004

[ 1343.376525] 502-17000000.gp10b, pid 2260, refs: 2:
[ 1343.376528] channel status: in use idle not busy
[ 1343.376536] RAMFC : TOP: 80000001002311ac PUT: 00000001002311ac GET: 00000001002311ac FETCH: 00000001002311ac
HEADER: 60400000 COUNT: 80000000
SYNCPOINT 00000000 00000000 SEMAPHORE 00000001 0000fba0 00000004 00001004

[ 1343.376549] 503-17000000.gp10b, pid 2260, refs: 2:
[ 1343.376552] channel status: in use idle not busy
[ 1343.376560] RAMFC : TOP: 8000001f08000060 PUT: 0000001f08000060 GET: 0000001f08000060 FETCH: 0000001f08000060
HEADER: 60400000 COUNT: 80000000
SYNCPOINT 00000000 00001701 SEMAPHORE 0000001e 020d0aa0 00000000 00000002

[ 1343.376573] 504-17000000.gp10b, pid 1991, refs: 2:
[ 1343.376577] channel status: in use idle not busy
[ 1343.376584] RAMFC : TOP: 8000001f0801d760 PUT: 0000001f0801d760 GET: 0000001f0801d760 FETCH: 0000001f0801d760
HEADER: 60400000 COUNT: 80000000
SYNCPOINT 00000000 00001601 SEMAPHORE 0000001e 012a0aa0 00000000 00000002

[ 1343.376597] 505-17000000.gp10b, pid 1587, refs: 8:
[ 1343.376601] channel status: in use on_eng_pending busy
[ 1343.376608] RAMFC : TOP: 8000001e089606a8 PUT: 0000001e089626f0 GET: 0000001e089606a8 FETCH: 0017e01e08961b00
HEADER: 20000df8 COUNT: 01110002
SYNCPOINT 00000000 00001501 SEMAPHORE 0000001e 035a00d0 00000001 00001001

[ 1343.376622] 506-17000000.gp10b, pid 888, refs: 2:
[ 1343.376625] channel status: in use idle not busy
[ 1343.376633] RAMFC : TOP: 8000001f08140b88 PUT: 0000001f08140b88 GET: 0000001f08140b88 FETCH: 0000001f08140b88
HEADER: 60400000 COUNT: 80000000
SYNCPOINT 00000000 00001301 SEMAPHORE 00000000 00000000 00000000 00000000

[ 1343.376647] 507-17000000.gp10b, pid 888, refs: 2:
[ 1343.376650] channel status: in use idle not busy
[ 1343.376658] RAMFC : TOP: 8000001f0800be40 PUT: 0000001f0800be40 GET: 0000001f0800be40 FETCH: 0000001f0800be40
HEADER: 60400000 COUNT: 80000000
SYNCPOINT 00000000 00001201 SEMAPHORE 0000000d fc001000 3e209dd4 01100002

[ 1343.376671] 508-17000000.gp10b, pid 891, refs: 2:
[ 1343.376674] channel status: in use idle not busy
[ 1343.376682] RAMFC : TOP: 0000000000000000 PUT: 0000000000000000 GET: 0000000000000000 FETCH: 0000000000000000
HEADER: 60400000 COUNT: 00000000
SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000

[ 1343.376695] 509-17000000.gp10b, pid 891, refs: 2:
[ 1343.376698] channel status: in use idle not busy
[ 1343.376705] RAMFC : TOP: 0000000000000000 PUT: 0000000000000000 GET: 0000000000000000 FETCH: 0000000000000000
HEADER: 60400000 COUNT: 00000000
SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000

[ 1343.376719] 510-17000000.gp10b, pid 891, refs: 2:
[ 1343.376722] channel status: in use idle not busy
[ 1343.376729] RAMFC : TOP: 0000000000000000 PUT: 0000000000000000 GET: 0000000000000000 FETCH: 0000000000000000
HEADER: 60400000 COUNT: 00000000
SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000

[ 1343.376742] 511-17000000.gp10b, pid 891, refs: 2:
[ 1343.376746] channel status: in use idle not busy
[ 1343.376753] RAMFC : TOP: 0000000000000000 PUT: 0000000000000000 GET: 0000000000000000 FETCH: 0000000000000000
HEADER: 60400000 COUNT: 00000000
SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000

[ 1343.376778] gk20a 17000000.gp10b: gk20a_fifo_handle_mmu_fault: mmu fault on engine 0, engine subid 0 (gpc), client 7 (t1 2), addr 0x00000010:0x40404000, type 0 (pde), info 0x00010700,inst_ptr 0xe3da9000

[ 1343.376788] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_os_r : 0
[ 1343.376795] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_cpuctl_r : 0x40
[ 1343.376801] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_idlestate_r : 0x1
[ 1343.376808] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_mailbox0_r : 0x3f
[ 1343.376815] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_mailbox1_r : 0x0
[ 1343.376821] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_irqstat_r : 0x0
[ 1343.376828] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_irqmode_r : 0x4
[ 1343.376834] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_irqmask_r : 0x8704
[ 1343.376841] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_irqdest_r : 0x0
[ 1343.376847] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_debug1_r : 0x40
[ 1343.376854] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_debuginfo_r : 0x0
[ 1343.376860] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_ctxsw_mailbox_r(0) : 0x4
[ 1343.376867] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_ctxsw_mailbox_r(1) : 0x0
[ 1343.376874] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_ctxsw_mailbox_r(2) : 0x50009
[ 1343.376880] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_ctxsw_mailbox_r(3) : 0x20
[ 1343.376887] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_ctxsw_mailbox_r(4) : 0x3ffd20
[ 1343.376893] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_ctxsw_mailbox_r(5) : 0x0
[ 1343.376900] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_ctxsw_mailbox_r(6) : 0x0
[ 1343.376906] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_ctxsw_mailbox_r(7) : 0x0
[ 1343.376912] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_engctl_r : 0x0
[ 1343.376919] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_curctx_r : 0x0
[ 1343.376925] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_nxtctx_r : 0x0
[ 1343.376933] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_IMB : 0xbadfbadf
[ 1343.376940] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_DMB : 0xbadfbadf
[ 1343.376948] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_CSW : 0xbadfbadf
[ 1343.376955] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_CTX : 0xbadfbadf
[ 1343.376962] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_EXCI : 0xbadfbadf
[ 1343.376970] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_PC : 0xbadfbadf
[ 1343.376977] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_SP : 0xbadfbadf
[ 1343.376985] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_PC : 0xbadfbadf
[ 1343.376992] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_SP : 0xbadfbadf
[ 1343.377000] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_PC : 0xbadfbadf
[ 1343.377007] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_SP : 0xbadfbadf
[ 1343.377014] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_PC : 0xbadfbadf
[ 1343.377022] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_SP : 0xbadfbadf
[ 1343.377028] gk20a 17000000.gp10b: gk20a_fifo_handle_mmu_fault: gr_status_r : 0x1000081
[ 1343.383856] gk20a 17000000.gp10b: gk20a_fifo_set_ctx_mmu_error_tsg: TSG 5 generated a mmu fault
[ 1343.383871] gk20a 17000000.gp10b: gk20a_set_error_notifier_locked: error notifier set to 31 for ch 502
[ 1343.383935] gk20a 17000000.gp10b: fifo_error_isr: channel reset initiated from fifo_error_isr; intr=0x10000000
[ 1359.279991] ---- mlocks ----

[ 1359.280032] ---- syncpts ----
[ 1359.280045] id 4 (disp_d) min 31254 max 31254 refs 1 (previous client : )
[ 1359.280052] id 5 (disp_e) min 2 max 2 refs 1 (previous client : )
[ 1359.280059] id 7 (vblank1) min 34531 max 0 refs 1 (previous client : )
[ 1359.280076] id 18 (17000000.gp10b_507) min 172676 max 172678 refs 1 (previous client : )
[ 1359.280082] id 19 (17000000.gp10b_506) min 250 max 250 refs 1 (previous client : )
[ 1359.280089] id 21 (17000000.gp10b_505) min 165874 max 165874 refs 1 (previous client : 17000000.gp10b_505)
[ 1359.280096] id 22 (17000000.gp10b_504) min 175204 max 175204 refs 1 (previous client : )
[ 1359.280102] id 23 (17000000.gp10b_502) min 22 max 22 refs 1 (previous client : 17000000.gp10b_503)
[ 1359.280109] id 25 (17000000.gp10b_500) min 6 max 8 refs 1 (previous client : 17000000.gp10b_501)

[ 1359.280651] ---- channels ----
[ 1359.280664]
channel 1 - 15820000.se

[ 1359.280670] NvHost basic channel registers:
[ 1359.280676] CMDFIFO_STAT_0: 00002040
[ 1359.280681] CMDFIFO_RDATA_0: 440e1449
[ 1359.280688] CMDP_OFFSET_0: 00000000
[ 1359.280693] CMDP_CLASS_0: 00000000
[ 1359.280698] CHANNELSTAT_0: 00000000
[ 1359.280702] The CDMA sync queue is empty.

[ 1359.280710]
channel 2 - 15830000.se

[ 1359.280717] NvHost basic channel registers:
[ 1359.280722] CMDFIFO_STAT_0: 00002040
[ 1359.280726] CMDFIFO_RDATA_0: 00104067
[ 1359.280733] CMDP_OFFSET_0: 00000000
[ 1359.280737] CMDP_CLASS_0: 00000000
[ 1359.280742] CHANNELSTAT_0: 00000000
[ 1359.280746] The CDMA sync queue is empty.

[ 1359.280755]
channel 3 - 15840000.se

[ 1359.280761] NvHost basic channel registers:
[ 1359.280765] CMDFIFO_STAT_0: 00002040
[ 1359.280770] CMDFIFO_RDATA_0: a0420880
[ 1359.280776] CMDP_OFFSET_0: 00000000
[ 1359.280781] CMDP_CLASS_0: 00000000
[ 1359.280786] CHANNELSTAT_0: 00000000
[ 1359.280789] The CDMA sync queue is empty.

[ 1359.280799]
---- host general irq ----

[ 1359.280806] sync_intc0mask = 0x00000001
[ 1359.280811] sync_intmask = 0x50000003
[ 1359.280815]
---- host syncpt irq mask ----

[ 1359.280822]
---- host syncpt irq status ----

[ 1359.280829] syncpt_thresh_cpu0_int_status(0) = 0x00000000
[ 1359.280835] syncpt_thresh_cpu0_int_status(1) = 0x00000000
[ 1359.280840] syncpt_thresh_cpu0_int_status(2) = 0x00000000
[ 1359.280845] syncpt_thresh_cpu0_int_status(3) = 0x00000000
[ 1359.280849] syncpt_thresh_cpu0_int_status(4) = 0x00000000
[ 1359.280854] syncpt_thresh_cpu0_int_status(5) = 0x00000000
[ 1359.280859] syncpt_thresh_cpu0_int_status(6) = 0x00000000
[ 1359.280864] syncpt_thresh_cpu0_int_status(7) = 0x00000000
[ 1359.280869] syncpt_thresh_cpu0_int_status(8) = 0x00000000
[ 1359.280874] syncpt_thresh_cpu0_int_status(9) = 0x00000000
[ 1359.280879] syncpt_thresh_cpu0_int_status(10) = 0x00000000
[ 1359.280885] syncpt_thresh_cpu0_int_status(11) = 0x00000000
[ 1359.280890] syncpt_thresh_cpu0_int_status(12) = 0x00000000
[ 1359.280895] syncpt_thresh_cpu0_int_status(13) = 0x00000000
[ 1359.280900] syncpt_thresh_cpu0_int_status(14) = 0x00000000
[ 1359.280905] syncpt_thresh_cpu0_int_status(15) = 0x00000000
[ 1359.280910] syncpt_thresh_cpu0_int_status(16) = 0x00000000
[ 1359.280915] syncpt_thresh_cpu0_int_status(17) = 0x00000000
[ 1359.280921] 17000000.gp10b pbdma 0:
[ 1359.280926] id: 0 (tsg), next_id: 0 (tsg) chan status: invalid
[ 1359.280939] PUT: 0000001e0001393c GET: 0000001e00013938 FETCH: 000003c9 HEADER: 20400010

[ 1359.280949] 17000000.gp10b eng 0:
[ 1359.280953] id: 5 (tsg), next_id: 0 (tsg), ctx status: switch
[ 1359.280958] faulted
[ 1359.280961] busy

[ 1359.280970] 17000000.gp10b eng 1:
[ 1359.280973] id: 6 (tsg), next_id: 6 (tsg), ctx status: invalid

[ 1359.281133] 498-17000000.gp10b, pid 2265, refs: 2:
[ 1359.281137] channel status: in use idle not busy
[ 1359.281145] RAMFC : TOP: 8000000100530294 PUT: 0000000100530294 GET: 0000000100530294 FETCH: 0000000100530294
HEADER: 60400000 COUNT: 80000000
SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000

[ 1359.281159] 499-17000000.gp10b, pid 2265, refs: 2:
[ 1359.281163] channel status: in use idle not busy
[ 1359.281170] RAMFC : TOP: 8000000100430294 PUT: 0000000100430294 GET: 0000000100430294 FETCH: 0000000100430294
HEADER: 60400000 COUNT: 80000000
SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000

[ 1359.281183] 500-17000000.gp10b, pid 2265, refs: 4:
[ 1359.281187] channel status: in use on_eng_pending busy
[ 1359.281194] RAMFC : TOP: 8000001f08140018 PUT: 0000001f08140030 GET: 0000001f08140018 FETCH: 0000001f08140030
HEADER: 20011b08 COUNT: a0110002
SYNCPOINT 00000000 00001901 SEMAPHORE 00000001 00017ffc 00000002 00080004

[ 1359.281207] 501-17000000.gp10b, pid 2265, refs: 2:
[ 1359.281210] channel status: in use idle not busy
[ 1359.281218] RAMFC : TOP: 80000001002311ac PUT: 00000001002311ac GET: 00000001002311ac FETCH: 00000001002311ac
HEADER: 60400000 COUNT: 80000000
SYNCPOINT 00000000 00000000 SEMAPHORE 00000001 0000fba0 00000004 00001004

[ 1359.281231] 502-17000000.gp10b, pid 2265, refs: 2:
[ 1359.281234] channel status: in use idle not busy
[ 1359.281241] RAMFC : TOP: 8000001f08000060 PUT: 0000001f08000060 GET: 0000001f08000060 FETCH: 0000001f08000060
HEADER: 60400000 COUNT: 80000000
SYNCPOINT 00000000 00001701 SEMAPHORE 0000001e 020d0aa0 00000000 00000002

[ 1359.281254] 503-17000000.gp10b, pid 2265, refs: 2:
[ 1359.281257] channel status: in use idle not busy
[ 1359.281264] RAMFC : TOP: 8000000100650500 PUT: 0000000100650500 GET: 0000000100650500 FETCH: 0000000100650500
HEADER: 60400000 COUNT: 80000000
SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000

[ 1359.281276] 504-17000000.gp10b, pid 1991, refs: 2:
[ 1359.281279] channel status: in use idle not busy
[ 1359.281287] RAMFC : TOP: 8000001f0802c640 PUT: 0000001f0802c640 GET: 0000001f0802c640 FETCH: 0000001f0802c640
HEADER: 60400000 COUNT: 80000000
SYNCPOINT 00000000 00001601 SEMAPHORE 0000001e 012a0aa0 00000000 00000002

[ 1359.281299] 505-17000000.gp10b, pid 1587, refs: 2:
[ 1359.281302] channel status: in use idle not busy
[ 1359.281309] RAMFC : TOP: 8000001f08001d80 PUT: 0000001f08001d80 GET: 0000001f08001d80 FETCH: 0000001f08001d80
HEADER: 60400000 COUNT: 80000000
SYNCPOINT 00000000 00001501 SEMAPHORE 0000001e 030c0aa0 0003565e 00001004

[ 1359.281323] 506-17000000.gp10b, pid 888, refs: 2:
[ 1359.281364] channel status: in use idle not busy
[ 1359.281379] RAMFC : TOP: 8000001f08140bb8 PUT: 0000001f08140bb8 GET: 0000001f08140bb8 FETCH: 0000001f08140bb8
HEADER: 60400000 COUNT: 80000000
SYNCPOINT 00000000 00001301 SEMAPHORE 00000000 00000000 00000000 00000000

[ 1359.281402] 507-17000000.gp10b, pid 888, refs: 4:
[ 1359.281409] channel status: in use on_eng_pending busy
[ 1359.281419] RAMFC : TOP: 8000001e00013938 PUT: 0000001e0001393c GET: 0000001e00013938 FETCH: 0000001f08022860
HEADER: 20400010 COUNT: 01110004
SYNCPOINT 00000000 00001201 SEMAPHORE 0000000d fc001000 28207580 01100002

[ 1359.281440] 508-17000000.gp10b, pid 891, refs: 2:
[ 1359.281445] channel status: in use idle not busy
[ 1359.281456] RAMFC : TOP: 0000000000000000 PUT: 0000000000000000 GET: 0000000000000000 FETCH: 0000000000000000
HEADER: 60400000 COUNT: 00000000
SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000

[ 1359.281478] 509-17000000.gp10b, pid 891, refs: 2:
[ 1359.281485] channel status: in use idle not busy
[ 1359.281498] RAMFC : TOP: 0000000000000000 PUT: 0000000000000000 GET: 0000000000000000 FETCH: 0000000000000000
HEADER: 60400000 COUNT: 00000000
SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000

[ 1359.281520] 510-17000000.gp10b, pid 891, refs: 2:
[ 1359.281527] channel status: in use idle not busy
[ 1359.281539] RAMFC : TOP: 0000000000000000 PUT: 0000000000000000 GET: 0000000000000000 FETCH: 0000000000000000
HEADER: 60400000 COUNT: 00000000
SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000

[ 1359.281551] 511-17000000.gp10b, pid 891, refs: 2:
[ 1359.281554] channel status: in use idle not busy
[ 1359.281561] RAMFC : TOP: 0000000000000000 PUT: 0000000000000000 GET: 0000000000000000 FETCH: 0000000000000000
HEADER: 60400000 COUNT: 00000000
SYNCPOINT 00000000 00000000 SEMAPHORE 00000000 00000000 00000000 00000000

[ 1359.281585] gk20a 17000000.gp10b: gk20a_fifo_handle_mmu_fault: mmu fault on engine 0, engine subid 0 (gpc), client 7 (t1 2), addr 0x00000010:0x40404000, type 0 (pde), info 0x00010700,inst_ptr 0x246a57000

[ 1359.281594] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_os_r : 0
[ 1359.281602] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_cpuctl_r : 0x40
[ 1359.281608] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_idlestate_r : 0x1
[ 1359.281615] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_mailbox0_r : 0x3f
[ 1359.281621] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_mailbox1_r : 0x0
[ 1359.281627] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_irqstat_r : 0x0
[ 1359.281633] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_irqmode_r : 0x4
[ 1359.281639] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_irqmask_r : 0x8704
[ 1359.281645] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_irqdest_r : 0x0
[ 1359.281650] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_debug1_r : 0x40
[ 1359.281656] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_debuginfo_r : 0x0
[ 1359.281662] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_ctxsw_mailbox_r(0) : 0x4
[ 1359.281669] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_ctxsw_mailbox_r(1) : 0x0
[ 1359.281675] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_ctxsw_mailbox_r(2) : 0x50009
[ 1359.281681] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_ctxsw_mailbox_r(3) : 0x20
[ 1359.281688] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_ctxsw_mailbox_r(4) : 0x3ffd20
[ 1359.281694] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_ctxsw_mailbox_r(5) : 0x0
[ 1359.281700] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_ctxsw_mailbox_r(6) : 0x0
[ 1359.281706] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_ctxsw_mailbox_r(7) : 0x0
[ 1359.281712] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_engctl_r : 0x0
[ 1359.281718] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_curctx_r : 0x0
[ 1359.281724] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: gr_fecs_nxtctx_r : 0x0
[ 1359.281732] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_IMB : 0xbadfbadf
[ 1359.281739] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_DMB : 0xbadfbadf
[ 1359.281745] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_CSW : 0xbadfbadf
[ 1359.281754] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_CTX : 0xbadfbadf
[ 1359.281761] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_EXCI : 0xbadfbadf
[ 1359.281768] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_PC : 0xbadfbadf
[ 1359.281776] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_SP : 0xbadfbadf
[ 1359.281783] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_PC : 0xbadfbadf
[ 1359.281789] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_SP : 0xbadfbadf
[ 1359.281797] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_PC : 0xbadfbadf
[ 1359.281804] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_SP : 0xbadfbadf
[ 1359.281811] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_PC : 0xbadfbadf
[ 1359.281818] gk20a 17000000.gp10b: gk20a_fecs_dump_falcon_stats: FECS_FALCON_REG_SP : 0xbadfbadf
[ 1359.281824] gk20a 17000000.gp10b: gk20a_fifo_handle_mmu_fault: gr_status_r : 0x1000081
[ 1359.287782] gk20a 17000000.gp10b: gk20a_fifo_set_ctx_mmu_error_tsg: TSG 5 generated a mmu fault
[ 1359.287796] gk20a 17000000.gp10b: gk20a_set_error_notifier_locked: error notifier set to 31 for ch 501
[ 1359.287857] gk20a 17000000.gp10b: fifo_error_isr: channel reset initiated from fifo_error_isr; intr=0x10000000
[ 1411.487458] compiz[2280]: unhandled level 0 translation fault (11) at 0x839fb0042c, esr 0x92000004
[ 1411.487471] pgd = ffffffc1bae58000
[ 1411.487477] [839fb0042c] *pgd=0000000000000000, *pud=0000000000000000

[ 1411.487492] CPU: 4 PID: 2280 Comm: compiz Tainted: G O 4.4.38 #70
[ 1411.487498] Hardware name: quill (DT)
[ 1411.487504] task: ffffffc1dd6b7080 ti: ffffffc1bb134000 task.ti: ffffffc1bb134000
[ 1411.487510] PC is at 0x7f9f6fcf2c
[ 1411.487514] LR is at 0x7f9f6fd060
[ 1411.487519] pc : [<0000007f9f6fcf2c>] lr : [<0000007f9f6fd060>] pstate: 80000000
[ 1411.487523] sp : 0000007fede1a490
[ 1411.487528] x29: 0000007fede1a8f0 x28: 0000000000000000
[ 1411.487535] x27: 0000000000000009 x26: 0000000000000001
[ 1411.487542] x25: 0000000001299348 x24: 0000000000557b40
[ 1411.487548] x23: 0000000000000000 x22: 0000000000000000
[ 1411.487555] x21: 0000000000557b40 x20: 0000000000002600
[ 1411.487561] x19: 0000000000002600 x18: 00000000012fa858
[ 1411.487567] x17: 0000007fadb21340 x16: 0000007f9ff03b30
[ 1411.487573] x15: 0000007fede1aac0 x14: 0000000000000000
[ 1411.487580] x13: 00000000000080e0 x12: 00000000000005cc
[ 1411.487586] x11: 0000007fede1a5c5 x10: 0000007fede1a5c6
[ 1411.487592] x9 : 0000007f9f6ff010 x8 : 0000000000000001
[ 1411.487598] x7 : 0000000000000000 x6 : 0000000000000051
[ 1411.487605] x5 : 000000004b400000 x4 : 00000000ffffffff
[ 1411.487611] x3 : 0000000000026012 x2 : 00000000012995e0
[ 1411.487617] x1 : 0000000001299348 x0 : 0000007f9fb00430

[ 1411.487632] Library at 0x7f9f6fcf2c: 0x7f9e60d000 /usr/lib/aarch64-linux-gnu/tegra/libnvidia-glcore.so.28.2.0
[ 1411.487638] Library at 0x7f9f6fd060: 0x7f9e60d000 /usr/lib/aarch64-linux-gnu/tegra/libnvidia-glcore.so.28.2.0
[ 1411.487642] vdso base = 0x7fadec5000
[ 1412.153080] bamfdaemon[1504]: unhandled level 2 translation fault (11) at 0x00000000, esr 0x92000006
[ 1412.153092] pgd = ffffffc1c4404000
[ 1412.153097] [00000000] *pgd=0000000246581003, *pud=0000000246581003, *pmd=0000000000000000

[ 1412.153115] CPU: 0 PID: 1504 Comm: bamfdaemon Tainted: G O 4.4.38 #70
[ 1412.153120] Hardware name: quill (DT)
[ 1412.153125] task: ffffffc1d48b3e80 ti: ffffffc075ed8000 task.ti: ffffffc075ed8000
[ 1412.153132] PC is at 0x7f84bcca80
[ 1412.153136] LR is at 0x0
[ 1412.153141] pc : [<0000007f84bcca80>] lr : [<0000000000000000>] pstate: 20000000
[ 1412.153145] sp : 0000007fd4594c20
[ 1412.153150] x29: 0000007fd4594c20 x28: 0000000000000000
[ 1412.153157] x27: 0000000000000003 x26: 00000000fffffffc
[ 1412.153165] x25: 00000000ffffffff x24: 0000000000000000
[ 1412.153171] x23: 0000000000673680 x22: 0000000000000000
[ 1412.153178] x21: 0000000000000001 x20: 0000000000000001
[ 1412.153185] x19: 0000000000000004 x18: 0000000000000000
[ 1412.153191] x17: 0000007f847ee3b0 x16: 0000000000000000
[ 1412.153198] x15: 0000000000000000 x14: 0000000000000000
[ 1412.153204] x13: 0000000000000000 x12: 0000000000400000
[ 1412.153211] x11: 00000000fffffffc x10: 0000000000673680
[ 1412.153218] x9 : 00000000ffffffff x8 : 0000000000000000
[ 1412.153224] x7 : 000000000067a070 x6 : 0000000000000000
[ 1412.153231] x5 : 0000000000000000 x4 : 0000000000000000
[ 1412.153237] x3 : 0000000000000000 x2 : 0000000000000002
[ 1412.153243] x1 : 0000000000000002 x0 : 00000000003fc000

[ 1412.153258] Library at 0x7f84bcca80: 0x7f84bb7000 /usr/lib/aarch64-linux-gnu/libgdk_pixbuf-2.0.so.0.3200.2
[ 1412.153264] Library at 0x0: 0x400000 /usr/lib/aarch64-linux-gnu/bamf/bamfdaemon
[ 1412.153268] vdso base = 0x7f8557c000
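
For anyone reading the esr values in the log above, here is a minimal sketch of how they can be decoded. It assumes the standard AArch64 ESR layout for data aborts (EC in bits [31:26], IL in bit 25, WnR in bit 6, DFSC in bits [5:0]). The function name `decode_esr` and the `EC_NAMES` table are made up for illustration and are not part of the kernel or Jailhouse.

```python
# Decode the fields of an AArch64 ESR value for a data abort.
# Assumed layout: EC = bits [31:26], IL = bit 25,
# and for data aborts WnR = bit 6 and DFSC = bits [5:0].

# Only the two data-abort exception classes are named here.
EC_NAMES = {
    0x24: "data abort from a lower exception level",
    0x25: "data abort taken without a change in exception level",
}


def decode_esr(esr: int) -> dict:
    """Split an ESR value into the fields relevant for a data abort."""
    ec = (esr >> 26) & 0x3F
    dfsc = esr & 0x3F
    # DFSC 0b0001LL encodes a translation fault at level LL.
    level = (dfsc & 0x3) if (dfsc >> 2) == 0b0001 else None
    return {
        "ec": ec,
        "ec_name": EC_NAMES.get(ec, "other"),
        "il": (esr >> 25) & 1,
        "write": bool((esr >> 6) & 1),  # WnR: True means a write access
        "dfsc": dfsc,
        "translation_fault_level": level,
    }


# The three esr values that appear in the log.
for value in (0x92000045, 0x92000004, 0x92000006):
    print(hex(value), decode_esr(value))
```

With this decoding, 0x92000045 (randomFog) is a level 1 translation fault on a write, while 0x92000004 (compiz) and 0x92000006 (bamfdaemon) are level 0 and level 2 translation faults on reads. All three are data aborts taken from user space, which matches the "unhandled level N translation fault" text that the kernel prints.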

Hi!
We have the same issue on our custom board.
It happens during CUDA processing when the board heats up.
There seems to be no problem while the board is cold, only once it gets hot.

Also, it seems that if you start the CUDA application first and the board heats up afterwards, everything works fine.

Did you find a solution?

Hi,

The error looks similar to this topic:
https://devtalk.nvidia.com/default/topic/1030869/jetson-tx2/crosscompile-ok-run-error-/

Could you first check whether the suggestion from linuxdev helps in your use case?

Thanks.