Good day,
We are developing hardware based on the NVIDIA Jetson Xavier NX platform.
We are using:
- Kernel from git://nv-tegra.nvidia.com/linux-4.9.git l4t/l4t-r32.5-4.9
- Rootfs from https://developer.nvidia.com/embedded/L4T/r32_Release_v5.0/T186/Tegra186_Linux_R32.5.0_aarch64.tbz2
We have placed the NVIDIA rootfs into /nvidia.
We have installed nvidia-cuda, libcudnn8, nvidia-cudnn8, and cuda-samples-1 into the NVIDIA rootfs.
We are trying to run the /usr/local/cuda/samples/0_Simple/matrixMul sample, using the following paths:
-
Path 1:
1.1. Run the following command:
env LD_LIBRARY_PATH=/nvidia/lib/aarch64-linux-gnu/:/nvidia/usr/local/cuda-10.2/targets/aarch64-linux/lib:/nvidia/usr/lib/aarch64-linux-gnu:/nvidia/usr/lib/aarch64-linux-gnu/tegra /nvidia/lib/aarch64-linux-gnu/ld-2.27.so ./matrixMul
The matrixMul sample hangs. In the console we observe the following log:
[root@P80092 ~]# dmesg | grep nvgpu
[ 185.668129] nvgpu: 17000000.gv11b __nvgpu_timeout_expired_msg_cpu:94 [ERR] Timeout detected @ gr_gk20a_ctx_wait_ucode+0xa4/0x3a8 [nvgpu]
[ 185.668383] nvgpu: 17000000.gv11b gr_gk20a_ctx_wait_ucode:528 [ERR] timeout waiting on mailbox=0 value=0x00000000
[ 185.668575] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:129 [ERR] gr_fecs_os_r : 0
[ 185.668724] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:131 [ERR] gr_fecs_cpuctl_r : 0x60
[ 185.668880] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:133 [ERR] gr_fecs_idlestate_r : 0x0
[ 185.669045] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:135 [ERR] gr_fecs_mailbox0_r : 0x0
[ 185.669237] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:137 [ERR] gr_fecs_mailbox1_r : 0x0
[ 185.669567] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:139 [ERR] gr_fecs_irqstat_r : 0x0
[ 185.670297] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:141 [ERR] gr_fecs_irqmode_r : 0x4
[ 185.671018] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:143 [ERR] gr_fecs_irqmask_r : 0x8705
[ 185.676410] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:145 [ERR] gr_fecs_irqdest_r : 0x0
[ 185.685609] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:147 [ERR] gr_fecs_debug1_r : 0x40
[ 185.695374] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:149 [ERR] gr_fecs_debuginfo_r : 0x0
[ 185.704677] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:151 [ERR] gr_fecs_ctxsw_status_1_r : 0x180
[ 185.714843] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:155 [ERR] gr_fecs_ctxsw_mailbox_r(0) : 0x0
[ 185.724808] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:155 [ERR] gr_fecs_ctxsw_mailbox_r(1) : 0x0
[ 185.735056] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:155 [ERR] gr_fecs_ctxsw_mailbox_r(2) : 0x1
[ 185.745361] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:155 [ERR] gr_fecs_ctxsw_mailbox_r(3) : 0x0
[ 185.755791] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:155 [ERR] gr_fecs_ctxsw_mailbox_r(4) : 0x0
[ 185.766033] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:155 [ERR] gr_fecs_ctxsw_mailbox_r(5) : 0x0
[ 185.776181] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:155 [ERR] gr_fecs_ctxsw_mailbox_r(6) : 0x0
[ 185.786409] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:155 [ERR] gr_fecs_ctxsw_mailbox_r(7) : 0x0
[ 185.796428] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:155 [ERR] gr_fecs_ctxsw_mailbox_r(8) : 0x0
[ 185.806627] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:155 [ERR] gr_fecs_ctxsw_mailbox_r(9) : 0x0
[ 185.816861] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:155 [ERR] gr_fecs_ctxsw_mailbox_r(10) : 0x0
[ 185.827368] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:155 [ERR] gr_fecs_ctxsw_mailbox_r(11) : 0x0
[ 185.838158] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:155 [ERR] gr_fecs_ctxsw_mailbox_r(12) : 0x0
[ 185.848649] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:155 [ERR] gr_fecs_ctxsw_mailbox_r(13) : 0x7fffffff
[ 185.859640] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:155 [ERR] gr_fecs_ctxsw_mailbox_r(14) : 0x0
[ 185.870143] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:155 [ERR] gr_fecs_ctxsw_mailbox_r(15) : 0x0
[ 185.880373] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:159 [ERR] gr_fecs_engctl_r : 0x0
[ 185.890147] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:161 [ERR] gr_fecs_curctx_r : 0x0
[ 185.899362] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:163 [ERR] gr_fecs_nxtctx_r : 0x0
[ 185.908381] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:169 [ERR] FECS_FALCON_REG_IMB : 0x0
[ 185.918180] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:175 [ERR] FECS_FALCON_REG_DMB : 0x0
[ 185.927801] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:181 [ERR] FECS_FALCON_REG_CSW : 0x110804
[ 185.937944] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:187 [ERR] FECS_FALCON_REG_CTX : 0x0
[ 185.947416] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:193 [ERR] FECS_FALCON_REG_EXCI : 0x0
[ 185.957020] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:200 [ERR] FECS_FALCON_REG_PC : 0x592e
[ 185.966916] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:206 [ERR] FECS_FALCON_REG_SP : 0x1fe0
[ 185.976534] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:200 [ERR] FECS_FALCON_REG_PC : 0x592e
[ 185.986763] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:206 [ERR] FECS_FALCON_REG_SP : 0x1fe0
[ 185.996325] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:200 [ERR] FECS_FALCON_REG_PC : 0x592e
[ 186.006389] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:206 [ERR] FECS_FALCON_REG_SP : 0x1fe0
[ 186.016261] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:200 [ERR] FECS_FALCON_REG_PC : 0x592e
[ 186.026233] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:206 [ERR] FECS_FALCON_REG_SP : 0x1fe0
-
Path 2:
2.1. Run chroot /nvidia
2.2. Run the sample.
Path 2 completed without issues.
-
Path 3:
3.1. Run Path 2.
3.2. Exit from chroot
3.3. Run the same command as in Path 1.
Path 3 also completed without issues.
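For reference, the invocation from step 1.1 (reused in step 3.3) can be sketched as a small shell helper. This is only an illustration of what we run, not a fix: the function names build_lib_path and run_with_rootfs are hypothetical and introduced here for readability, while the library directories and the loader path are the ones from the command above.

```shell
#!/bin/sh
# Hypothetical helper: assembles the LD_LIBRARY_PATH from step 1.1
# for a rootfs unpacked under the given prefix (e.g. /nvidia).
build_lib_path() {
    root="$1"
    printf '%s:%s:%s:%s\n' \
        "$root/lib/aarch64-linux-gnu/" \
        "$root/usr/local/cuda-10.2/targets/aarch64-linux/lib" \
        "$root/usr/lib/aarch64-linux-gnu" \
        "$root/usr/lib/aarch64-linux-gnu/tegra"
}

# Hypothetical wrapper: starts a binary through the rootfs dynamic loader
# with that library path, exactly as in step 1.1 (no chroot involved).
run_with_rootfs() {
    root="$1"
    shift
    env LD_LIBRARY_PATH="$(build_lib_path "$root")" \
        "$root/lib/aarch64-linux-gnu/ld-2.27.so" "$@"
}

# Usage on the target, from the matrixMul directory:
#   run_with_rootfs /nvidia ./matrixMul
```

With the prefix /nvidia, build_lib_path produces the same LD_LIBRARY_PATH string as in step 1.1, so Path 1 and step 3.3 are guaranteed to use an identical environment.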
Please help us understand what we should change in our system setup
in order to run the samples without first running them in chroot.
–
Sincerely yours,
Yuriy