NVGPU Exception at boot? Xavier NX

The following message during boot looks like a kernel oops but is not one:

Alloc smaller than page size! (128)!
[ 5169.922300] ------------[ cut here ]------------
[ 5169.922707] WARNING: CPU: 3 PID: 17375 at nvidia/nvgpu/drivers/gpu/nvgpu/os/linux/kmem.c:315 __nvgpu_check_valloc_size.part.0+0x24/0x40 [nvgpu]
[ 5169.922927] Modules linked in: ar0521(O) xt_nat xt_tcpudp veth xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 nfnetlink xt_addrtype iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables br_netfilter overlay inv_mpu6050_spi inv_mpu6050 spidev nvgpu fuse [last unloaded: ar0521]

[ 5169.923650] CPU: 3 PID: 17375 Comm: pool-BirdsEyeVi Tainted: G        W  O    4.9.253-l4t-r32.6+g76839fee86ee #1
[ 5169.923823] Hardware name: NVIDIA Jetson Xavier NX Developer Kit (DT)
[ 5169.923937] task: ffffffc1408b5400 task.stack: ffffffc11d38c000
[ 5169.924236] PC is at __nvgpu_check_valloc_size.part.0+0x24/0x40 [nvgpu]
[ 5169.924730] LR is at __nvgpu_check_valloc_size.part.0+0x24/0x40 [nvgpu]
[ 5169.925039] pc : [<ffffff8000f69ef4>] lr : [<ffffff8000f69ef4>] pstate: 40400145
[ 5169.929107] sp : ffffffc11d38f930
[ 5169.932518] x29: ffffffc11d38f930 x28: ffffffc162e67c00
[ 5169.938289] x27: ffffff8000f7c480 x26: ffffffc11d38fba8
[ 5169.943275] x25: ffffffc162e67c80 x24: ffffff8000f7c458
[ 5169.948612] x23: 0000000000400000 x22: ffffff8000edc8ec
[ 5169.954212] x21: ffffffc1ef000000 x20: ffffff800be17000
[ 5169.959990] x19: 0000000000000080 x18: 0000000000000000
[ 5169.965510] x17: 0000007f7b183d80 x16: ffffff8008273c40
[ 5169.971200] x15: 0000000000000010 x14: ffffffffffffffff
[ 5169.976886] x13: ffffff808a3185cf x12: ffffff800a3185d7
[ 5169.982485] x11: 0000000000000000 x10: ffffffc11d38f5f0
[ 5169.988341] x9 : 00000000000009a8 x8 : ffffff80083d38f0
[ 5169.994113] x7 : ffffff800a0e46a8 x6 : ffffffc1ffd5fbf0
[ 5169.999378] x5 : ffffffc1ffd5fbf0 x4 : 0000000000000000
[ 5170.004964] x3 : ffffffc1ffd657f8 x2 : ffffffc1ffd5fbf0
[ 5170.010047] x1 : ffffffc1408b5400 x0 : 0000000000000024

[ 5170.016780] ---[ end trace 28c5aa77aa3dece7 ]---
[ 5170.021244] Call trace:
[ 5170.023889] [<ffffff8000f69ef4>] __nvgpu_check_valloc_size.part.0+0x24/0x40 [nvgpu]
[ 5170.031302] [<ffffff8000eabd1c>] __nvgpu_track_vzalloc+0x44/0x98 [nvgpu]
[ 5170.037429] [<ffffff8000eabda4>] __nvgpu_vzalloc+0x34/0x80 [nvgpu]
[ 5170.043554] [<ffffff8000edc914>] __set_pd_level+0x204/0x3b0 [nvgpu]
[ 5170.049240] [<ffffff8000edccfc>] __nvgpu_gmmu_update_page_table+0x23c/0x4f8 [nvgpu]
[ 5170.056413] [<ffffff8000edd2c4>] gk20a_locked_gmmu_map+0x114/0x278 [nvgpu]
[ 5170.063065] [<ffffff8000edc4e0>] __nvgpu_gmmu_map+0xc8/0x140 [nvgpu]
[ 5170.068930] [<ffffff8000edd018>] nvgpu_gmmu_map+0x60/0x78 [nvgpu]
[ 5170.075060] [<ffffff8000f48da8>] set_syncpt_ro_map_gpu_va_locked+0x60/0xa0 [nvgpu]
[ 5170.082138] [<ffffff8000f4c01c>] gv11b_fifo_get_sync_ro_map+0x44/0x80 [nvgpu]
[ 5170.089241] [<ffffff8000eb05c0>] gk20a_as_dev_ioctl+0x238/0xa60 [nvgpu]
[ 5170.095455] [<ffffff8008273408>] do_vfs_ioctl+0xb0/0x8e8
[ 5170.100876] [<ffffff8008273cd0>] SyS_ioctl+0x90/0xa8
[ 5170.106122] [<ffffff800808395c>] __sys_trace_return+0x0/0x4

A look at the source code shows that the function at the top of the call stack does nothing but issue a WARN, which is a single line of code:

static void __nvgpu_check_valloc_size(unsigned long size)
{
        WARN(size < PAGE_SIZE, "Alloc smaller than page size! (%lu)!\n", size);
}
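To make the condition behind the message concrete, here is a minimal user-space sketch of the same size check. It is not the nvgpu code: `SKETCH_PAGE_SIZE`, `alloc_would_warn`, `pages_backing` and `bytes_unused` are hypothetical names, and the 4 KiB page size is an assumption (the real value comes from the kernel configuration). It only illustrates why a 128-byte request trips the check, and why a page-granular allocator leaves most of the page unused for such a request.

```c
#include <stdbool.h>

/* Assumption: 4 KiB pages; the running kernel's PAGE_SIZE may differ. */
#define SKETCH_PAGE_SIZE 4096UL

/* Same predicate as the WARN() condition: true when the request is
 * smaller than one page. Hypothetical helper, not part of nvgpu. */
static bool alloc_would_warn(unsigned long size)
{
        return size < SKETCH_PAGE_SIZE;
}

/* Page-granular allocators hand out whole pages, so round the
 * request up to the number of pages that would back it. */
static unsigned long pages_backing(unsigned long size)
{
        return (size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Bytes in the backing pages that the caller did not ask for. */
static unsigned long bytes_unused(unsigned long size)
{
        return pages_backing(size) * SKETCH_PAGE_SIZE - size;
}
```

Under these assumptions, `alloc_would_warn(128)` is true and `bytes_unused(128)` is 3968, so the 128-byte request from the log would still occupy a full page. The check itself only reports this; it does not fail the allocation.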

I am not sure how to prevent this, or whether I should simply ignore it.

Thanks

We seldom look into nvgpu issues because they should not occur in general use cases.

If you are using a custom board, move your module to an NVIDIA devkit and flash it again with sdkmanager.

If the issue still occurs on the NVIDIA devkit, it could be a hardware issue.

If it does not, we need to know which customization on your side causes the issue.

I have not customised the kernel at all as far as page size is concerned. The warning does not appear when the same board is booted from the Ubuntu image.

We take it to mean that the memory allocation is not optimal at that point, but the warning only happens once at boot time. The nvgpu driver code was not modified in any way.

What does this mean?

"The warning does not appear when the same board is booted off Ubuntu image."

Are you using something that is not Ubuntu?

I am so sorry, yes, it’s a Yocto build.

Should I file a bug report? I need to get to the bottom of this. Was this solved in more recent versions of meta-tegra?

I am using Linux for Tegra release R32.6.1
JetPack release 4.6

Just to clarify, meta-tegra is not maintained directly by the NVIDIA team.

What you can do now is try the latest JetPack with a pure image from sdkmanager on your board and see whether you can reproduce the issue.

But I actually don’t think you will be able to reproduce it with that setup.
