Hi
I have found a very annoying problem on DRIVE OS 5.1.9.0 (QNX). While running my application on it, which uses TensorRT as the inference engine, it behaves as follows:
At first the system has more than 20 GB of available memory.
I start my application, run it for some time, and then stop it.
Even after my application has stopped, when I use top to check the system's available memory, it has decreased by about 2–4 GB.
I then repeat several rounds of this start/run/stop test. Eventually the system's available memory drops to about 4 GB, and from then on I can no longer start my application because of insufficient memory. Currently, only restarting the Xavier box works around it.
So my question is: how can I reclaim the QNX system memory after my application has stopped? Or is there a system bug in DRIVE OS (QNX)?
Below is the status of all my system processes as reported by hogs:
PID NAME MSEC PIDS SYS MEMORY
172044 io-nvtime 1 0% 0% 264k 0%
8194 devv-nvhv 1 0% 0% 148k 0%
24579 io-nvbootprofi 1 0% 0% 172k 0%
4100 devv-nvivc 1 0% 0% 152k 0%
20485 nvpm_server 1 0% 0% 148k 0%
28678 slogger2 1 0% 0% 256k 0%
4103 io-nvsciipc 1 0% 0% 872k 0%
28680 dumper 1 0% 0% 484k 0%
28681 pipe 1 0% 0% 280k 0%
860170 devm-nvcap 1 0% 0% 740k 0%
36875 io-nvbpmpivc 1 0% 0% 284k 0%
208919 io-nvgpio 1 0% 0% 432k 0%
143373 clock_init 1 0% 0% 184k 0%
639060 devc-nvtipc 1 0% 0% 164k 0%
94223 io-nvclock 1 0% 0% 312k 0%
6963216 sshd 1 0% 0% 540k 0%
290833 nvsplash 1 0% 0% 216k 0%
94226 io-nvpowergate 1 0% 0% 412k 0%
151571 devc-ser8250-t 1 0% 0% 228k 0%
49172 io-nvmpalloc 1 0% 0% 172k 0%
49173 io-nvsyseventc 1 0% 0% 260k 0%
57366 io-nvsku 1 0% 0% 188k 0%
655447 io-nvthermmon 1 0% 0% 3664k 3%
643160 devc-ser8250-t 1 0% 0% 232k 0%
221209 io-nvi2c 1 0% 0% 324k 0%
221210 io-nvi2c 1 0% 0% 228k 0%
221211 io-nvi2c 1 0% 0% 324k 0%
221212 i2c-tegra 1 0% 0% 164k 0%
221213 io-nvi2c 1 0% 0% 228k 0%
434206 io-nvsys 1 0% 0% 3592k 3%
315423 syslogd 1 0% 0% 208k 0%
340000 io-nvdt 1 0% 0% 1176k 1%
393249 automount 1 0% 0% 196k 0%
1929250 dhcp.client 1 0% 0% 228k 0%
434211 devb-nvhvblk 1 0% 0% 708k 0%
1044516 io-pkt-v6-hc 1 0% 0% 2476k 2%
655450 io-nvspi 1 0% 0% 236k 0%
364582 io-nvthermmon 1 0% 0% 3656k 3%
6967335 -sh 1 0% 0% 180k 0%
372776 io-nvpmu 1 0% 0% 3812k 3%
495657 io-usb-otg 1 0% 0% 1660k 1%
450602 lanemux_pad_in 1 0% 0% 3568k 3%
450603 devb-loopback 1 0% 0% 732k 0%
450604 devb-nvhvblk 1 0% 0% 708k 0%
466989 usb_init 1 0% 0% 3560k 3%
499758 devb-loopback 1 0% 0% 732k 0%
3358767 sshd 1 0% 0% 716k 0%
466992 devb-loopback 1 0% 0% 1896k 1%
479281 io-nv_pcie_man 1 0% 0% 3656k 3%
466994 devb-nvhvblk 1 0% 0% 708k 0%
495667 devb-loopback 1 0% 0% 732k 0%
495668 devb-nvhvblk 1 0% 0% 708k 0%
495669 devb-loopback 1 0% 0% 732k 0%
495670 devb-nvhvblk 1 0% 0% 708k 0%
499767 devb-nvhvblk 1 0% 0% 708k 0%
516152 devb-loopback 1 0% 0% 732k 0%
516153 devb-nvhvblk 1 0% 0% 708k 0%
532538 devb-loopback 1 0% 0% 1936k 2%
1851451 io-pkt-v6-hc 1 0% 0% 2476k 2%
3362876 -sh 1 0% 0% 248k 0%
925768 qconn 1 0% 0% 244k 0%
1044542 dhcp.client 1 0% 0% 228k 0%
1318975 dhcp.client 1 0% 0% 228k 0%
3608640 sshd 1 0% 0% 604k 0%
643137 io-nvspi 1 0% 0% 236k 0%
770114 sh 1 0% 0% 180k 0%
647235 devc-nvcan 1 0% 0% 3644k 3%
1822788 fanctrl 1 0% 0% 3720k 3%
884805 devc-pty 1 0% 0% 216k 0%
3612742 -sh 1 0% 0% 216k 0%
880711 sshd 1 0% 0% 464k 0%
655453 devc-nvcan 1 0% 0% 3644k 3%
614473 pci-server 1 0% 0% 1064k 1%
565322 devb-umass 1 0% 0% 200k 0%
589901 devm-nvisc 1 0% 0% 324k 0%
639056 devc-nvvse 1 0% 0% 3832k 3%
794705 inetd 1 0% 0% 192k 0%
602195 random 1 0% 0% 184k 0%
7184421 hogs 3 0% 0% 148k 0%
36878 NvGuard_Layer0 4 0% 0% 380k 0%
184344 devg-nvrm 4 0% 0% 14816k 15%
729149 io-pkt-v6-hc 7 0% 0% 5424k 5%
1 procnto-smp-in 8 0% 0% 0k 0%
0 [idle] 2989 14% 99% 0k 0%
1 [idle] 3008 14% 100% 0k 0%
2 [idle] 3010 14% 100% 0k 0%
3 [idle] 3011 14% 100% 0k 0%
4 [idle] 3010 14% 100% 0k 0%
5 [idle] 3008 14% 100% 0k 0%
6 [idle] 3011 14% 100% 0k 0%
And below is my system state (memory/CPU) as reported by top:
83 processes; 445 threads;
CPU states: 0.1% user, 0.1% kernel
CPU 0 Idle: 98.9%
CPU 1 Idle: 99.2%
CPU 2 Idle: 99.9%
CPU 3 Idle: 100.0%
CPU 4 Idle: 100.0%
CPU 5 Idle: 100.0%
CPU 6 Idle: 100.0%
Memory: 29622M total, 2706M avail, page size 4K
PID TID PRI STATE HH:MM:SS CPU COMMAND
1 16 10 Run 0:00:32 0.10% kernel
7192613 1 10 Rply 0:00:00 0.05% top
729149 2 21 Rcv 0:14:33 0.03% io-pkt-v6-hc
36878 4 10 Rcv 0:05:38 0.01% NvGuard_Layer0_GOS1
184344 56 21 Intr 0:02:58 0.00% devg-nvrm
1851451 2 21 Rcv 0:01:46 0.00% io-pkt-v6-hc
184344 61 10 CdV 0:01:36 0.00% devg-nvrm
1044516 2 21 Rcv 0:02:07 0.00% io-pkt-v6-hc
172044 4 10 Rcv 0:00:50 0.00% io-nvtime
1929250 1 10 SigW 0:00:02 0.00% dhcp.client
Min Max Average
CPU 0 idle: 98% 98% 98%
CPU 1 idle: 99% 99% 99%
CPU 2 idle: 99% 99% 99%
CPU 3 idle: 100% 100% 100%
CPU 4 idle: 100% 100% 100%
CPU 5 idle: 100% 100% 100%
CPU 6 idle: 100% 100% 100%
Mem Avail: 2706MB 2706MB 2706MB
Processes: 83 83 83
Threads: 445 445 445
I wonder where my system memory goes, since the processes combined do not occupy anywhere near that much memory.
VickNV
March 6, 2020, 1:57pm
3
Hi wenjun.huang,
This forum covers only DRIVE Software support (Linux only). I assume you get DRIVE OS QNX access via your nvonline (https://partners.nvidia.com/) account, so please file a bug there. Your request will be handled through that channel. Thanks!
It turns out that core dump files under the memory file system /dev/shmem/ were occupying the system memory. Running
rm /dev/shmem/*core
solved my problem.
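For anyone hitting the same symptom, here is a minimal sketch of that cleanup. It assumes, as described above, that the leaked memory is held by core dump files whose names end in "core" inside the in-memory file system. SHMEM_DIR defaults to a scratch directory here so the snippet can be tried safely; on the actual QNX target you would point it at /dev/shmem.

```shell
#!/bin/sh
# Directory holding the in-memory core dumps. On the DRIVE OS QNX target
# this would be /dev/shmem; a scratch path is used here for illustration.
SHMEM_DIR="${SHMEM_DIR:-/tmp/shmem-demo}"

# --- demo setup: fake two core files plus one unrelated file ---
mkdir -p "$SHMEM_DIR"
touch "$SHMEM_DIR/myapp.core" "$SHMEM_DIR/other.core" "$SHMEM_DIR/keepme.txt"

# Show the core files (and their sizes) before removing them.
ls -l "$SHMEM_DIR"/*core 2>/dev/null

# Remove only the core dumps, leaving other shared-memory objects alone.
rm -f "$SHMEM_DIR"/*core

# Verify: only the non-core file should remain.
ls "$SHMEM_DIR"
```

Because files under /dev/shmem live entirely in RAM on QNX, each core dump written there by dumper consumes physical memory until it is deleted, which matches the gradual loss of available memory across start/run/stop cycles.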