CUDA hangs after installing JetPack and rebooting

I tried turning off ASPM, following "How to edit kernel's command line - #7 by linuxdev".
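
To double-check that the setting actually took effect after the reboot, the running kernel's command line can be inspected. This is only a minimal sketch, assuming ASPM was disabled with the standard `pcie_aspm=off` kernel parameter; the helper names `aspm_disabled` and `running_cmdline` are made up for illustration and are not part of JetPack.

```python
from pathlib import Path


def aspm_disabled(cmdline: str) -> bool:
    """Return True if the kernel command line contains the token pcie_aspm=off.

    Assumption: ASPM was turned off via the pcie_aspm=off kernel parameter.
    The command line is split on whitespace so only an exact token matches.
    """
    return "pcie_aspm=off" in cmdline.split()


def running_cmdline() -> str:
    """Read the command line of the currently running kernel (Linux only)."""
    return Path("/proc/cmdline").read_text()
```

On the board itself, `print(aspm_disabled(running_cmdline()))` should print `True` if the parameter reached the kernel. If it prints `False`, the edit to the boot configuration probably was not picked up.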

CUDA still hangs after the reboot. There is no AER error, but there are other errors.
dmesg_aspmoff.txt (159.2 KB)
Example:

[   14.913957] nvgpu: 17000000.gv11b        gk20a_gr_handle_fecs_error:5281 [ERR]  fecs watchdog triggered for channel 511, cannot ctxsw anymore !!
[   14.914220] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:129  [ERR]  gr_fecs_os_r : 0
[   14.914376] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:131  [ERR]  gr_fecs_cpuctl_r : 0x40
[   14.914563] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:133  [ERR]  gr_fecs_idlestate_r : 0x1
[   14.914736] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:135  [ERR]  gr_fecs_mailbox0_r : 0x0
[   14.914904] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:137  [ERR]  gr_fecs_mailbox1_r : 0x0
[   14.915069] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:139  [ERR]  gr_fecs_irqstat_r : 0x0
[   14.915258] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:141  [ERR]  gr_fecs_irqmode_r : 0x4
[   14.915987] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:143  [ERR]  gr_fecs_irqmask_r : 0x8705
[   14.916713] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:145  [ERR]  gr_fecs_irqdest_r : 0x0
[   14.920287] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:147  [ERR]  gr_fecs_debug1_r : 0x40
[   14.929731] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:149  [ERR]  gr_fecs_debuginfo_r : 0x0
[   14.939351] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:151  [ERR]  gr_fecs_ctxsw_status_1_r : 0x980
[   14.949336] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(0) : 0x1
[   14.959504] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(1) : 0x0
[   14.970013] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(2) : 0x90009
[   14.980887] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(3) : 0x0
[   14.990739] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(4) : 0x1ffda0
[   15.001736] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(5) : 0x0
[   15.011899] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(6) : 0x15
[   15.022146] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(7) : 0x0
[   15.032613] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(8) : 0x0
[   15.042818] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(9) : 0x0
[   15.052955] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(10) : 0x0
[   15.063170] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(11) : 0x0
[   15.073830] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(12) : 0x0
[   15.084174] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(13) : 0x3fffffff
[   15.095230] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(14) : 0x0
[   15.105461] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:155  [ERR]  gr_fecs_ctxsw_mailbox_r(15) : 0x0
[   15.116023] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:159  [ERR]  gr_fecs_engctl_r : 0x0
[   15.125412] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:161  [ERR]  gr_fecs_curctx_r : 0x0
[   15.134966] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:163  [ERR]  gr_fecs_nxtctx_r : 0x0
[   15.144218] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:169  [ERR]  FECS_FALCON_REG_IMB : 0x0
[   15.153811] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:175  [ERR]  FECS_FALCON_REG_DMB : 0x0
[   15.163165] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:181  [ERR]  FECS_FALCON_REG_CSW : 0x110800
[   15.173197] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:187  [ERR]  FECS_FALCON_REG_CTX : 0x0
[   15.182716] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:193  [ERR]  FECS_FALCON_REG_EXCI : 0x0
[   15.192729] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:200  [ERR]  FECS_FALCON_REG_PC : 0x51c4
[   15.202499] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:206  [ERR]  FECS_FALCON_REG_SP : 0x1f44
[   15.212171] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:200  [ERR]  FECS_FALCON_REG_PC : 0x51c8
[   15.222287] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:206  [ERR]  FECS_FALCON_REG_SP : 0x1f48
[   15.232109] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:200  [ERR]  FECS_FALCON_REG_PC : 0x62
[   15.241738] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:206  [ERR]  FECS_FALCON_REG_SP : 0x1f48
[   15.251516] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:200  [ERR]  FECS_FALCON_REG_PC : 0x51c4
[   15.261290] nvgpu: 17000000.gv11b      gk20a_fecs_dump_falcon_stats:206  [ERR]  FECS_FALCON_REG_SP : 0x1f48
[   17.287905] nvgpu: 17000000.gv11b   nvgpu_set_error_notifier_locked:137  [ERR]  error notifier set to 8 for ch 511
[   17.288168] nvgpu: 17000000.gv11b   gv11b_fifo_handle_ctxsw_timeout:1611 [ERR]  ctxsw timeout error: active engine id =0, tsg=0, info: awaiting ack ms=3100
[   17.288548] ---- mlocks ----