On the Orin Nano 4GB, the following region is reserved by default, ostensibly for “nvgpu comptags”:
reserved-memory {
    linux,cma { /* Needed for nvgpu comptags */
        compatible = "shared-dma-pool";
        reusable;
        size = <0x0 0x10000000>; /* 256MB */
        alignment = <0x0 0x10000>;
        linux,cma-default;
        status = "okay";
    };
};
Looking at the cbc_status file, only ~15 MB appears to be allocated for the comptag store. Based on my reading of the code, this makes sense for our system memory capacity.
$ cat /sys/kernel/debug/17000000.gpu/cbc_status
cbc.compbit_backing_size: 14876672
cbc.comptags_per_cacheline: 32
cbc.gobs_per_comptagline_per_slice: 16
cbc.max_comptag_lines: 58000
cbc.comp_tags.size: 57999
cbc.compbit_store.base_hw: 1365504
cbc.compbit_store.mem.aperture: 1
cbc.compbit_store.mem.size: 14942208
cbc.compbit_store.mem.aligned_size: 14942208
cbc.compbit_store.mem.gpu_va: 0
cbc.compbit_store.mem.skip_wmb: 0
cbc.compbit_store.mem.free_gpu_va: 0
cbc.compbit_store.mem.mem_flags: 0
cbc.compbit_store.mem.cpu_va: ffff0000cd600000
cbc.compbit_store.mem.pa: 14d600000
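To put those numbers next to the 256MB pool, here is a rough sketch of the arithmetic in Python. It only parses a few of the lines pasted above and compares them against the size cell from the linux,cma node. The helper name parse_cbc_status is mine, not anything from nvgpu, and the sketch assumes the comptag store is the allocation that comes out of the default CMA pool, which is what the device tree comment suggests but which I have not confirmed.

```python
# Excerpt of the cbc_status output pasted above (values copied verbatim).
CBC_STATUS = """\
cbc.compbit_backing_size: 14876672
cbc.max_comptag_lines: 58000
cbc.compbit_store.mem.size: 14942208
cbc.compbit_store.mem.aligned_size: 14942208
"""

# Second cell of `size = <0x0 0x10000000>` in the linux,cma node.
CMA_POOL_BYTES = 0x10000000
MIB = 1024 * 1024


def parse_cbc_status(text):
    """Split 'key: value' lines into a dict of strings (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


fields = parse_cbc_status(CBC_STATUS)
backing_bytes = int(fields["cbc.compbit_backing_size"])
store_bytes = int(fields["cbc.compbit_store.mem.size"])

backing_mib = backing_bytes / MIB
store_mib = store_bytes / MIB
pool_mib = CMA_POOL_BYTES / MIB
pool_fraction = store_bytes / CMA_POOL_BYTES

print(f"compbit backing size: {backing_mib:.4f} MiB")
print(f"compbit store size:   {store_mib:.4f} MiB")
print(f"CMA pool size:        {pool_mib:.0f} MiB")
print(f"store / pool:         {pool_fraction:.1%}")
```

So the store is about 14.25 MiB, a small fraction of the default pool. This says nothing about what else allocates from the same pool or about any alignment requirement on the store itself.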
However, when I reduce the size of the shared-dma-pool reserved region, I get the following error on boot:
[ 6.211899] nvgpu: 17000000.gpu ga10b_fb_cbc_configure:100 [ERR] CBC marked unsafe by MMU, check cbc config
[ 6.225275] nvgpu: 17000000.gpu nvgpu_timeout_expired_msg_retry:118 [ERR] No more retries @ tu104_cbc_ctrl+0x21c/0x440 [nvgpu]
[ 6.225282] nvgpu: 17000000.gpu tu104_cbc_ctrl:192 [ERR] comp tag clear timeout
What am I missing here? Further, for compute-only (no graphics) workloads, is it reasonable to disable CONFIG_NVGPU_COMPRESSION entirely?
Even with this error on boot, GPU workloads still run, and I don’t recall encountering any performance issues, though I should retest.