[com COM18] (2024-08-26_101738) COM18 (USB-SERIAL CH340 (COM18)).log (230.1 KB)
kern.log (68.2 KB)
nvmap_trace.zip (2.1 MB)
test_shell.txt (243 Bytes)
Hello, we built the /usr/local/cuda-11.4/samples/5_Simulations/oceanFFT/oceanFFT.cpp demo to get the oceanFFT binary, then used the test_shell.txt shell script to stress-test the CUDA GPU. We found that the Orin console (file-CH340-COM18.log) very quickly shows a crash, after which the Orin auto-reboots. nvmap_trace.zip contains the nvmap trace events we captured, and kern.log is /var/log/kern.log. We use JetPack 5.1.1. We want to know: what is causing this kernel crash (and auto-reboot), and how can we fix it?
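The contents of test_shell.txt are not shown in this thread, but based on the later posts (multiple oceanFFT instances launched concurrently), a hypothetical reconstruction might look like the sketch below. The function name, instance count, and the assumption that the binary is launched without arguments are all guesses, not the actual script.

```shell
#!/bin/sh
# Hypothetical sketch of a multi-instance GPU stress launcher
# (test_shell.txt itself is not shown in the thread).

launch_instances() {
    n=$1
    cmd=$2
    i=1
    while [ "$i" -le "$n" ]; do
        # Discard per-instance output; run each copy in the background.
        "$cmd" > /dev/null 2>&1 &
        i=$((i + 1))
    done
    wait
    echo "launched $n instances of $cmd"
}

# Path taken from this thread; adjust for your CUDA install:
launch_instances 8 /usr/local/cuda-11.4/samples/5_Simulations/oceanFFT/oceanFFT
```

On the Orin, each instance keeps rendering until killed, so `wait` blocks while the GPU stays loaded.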
Hi,
Do you run multiple oceanFFT instances at the same time?
If yes, do you hit the same issue when running a single oceanFFT app?
Thanks.
Thanks. Yes, we run multiple oceanFFT instances at the same time. With a single oceanFFT app we cannot reproduce the same issue as quickly.
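For comparing the single-instance and multi-instance cases, it can help to record serial-console and kernel-log context while the workload runs, as was done for the attached logs. The snippet below sketches one way to enable the nvmap ftrace events mentioned above; the event-group name `nvmap` and the tracefs mount point are assumptions for JetPack 5.x Tegra kernels, so verify them under `/sys/kernel/debug/tracing/events/` on the target.

```shell
#!/bin/sh
# Sketch: enable nvmap ftrace events before running the stress test
# (event-group name "nvmap" is an assumption -- check the target).
TRACEFS=${TRACEFS:-/sys/kernel/debug/tracing}

enable_nvmap_trace() {
    if [ -d "$TRACEFS/events/nvmap" ]; then
        echo 1 > "$TRACEFS/events/nvmap/enable"
        echo 1 > "$TRACEFS/tracing_on"
        echo "nvmap tracing enabled"
    else
        echo "nvmap events not found under $TRACEFS"
    fi
}

enable_nvmap_trace
# ... run the workload, then read $TRACEFS/trace (needs root) ...
```

Reading `$TRACEFS/trace` after the crash window gives a log comparable to the attached nvmap_trace.zip.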
Hi,
Do you mean the crash still occurs with a single oceanFFT, just requiring more time?
Thanks.
Hi,
We have not tested with a single oceanFFT. If you want, we can try it.
Hi,
We test with multiple oceanFFT instances at the same time because we have a ROS application that uses the CUDA API “gpuConvertUYVYtoBGR((uint8_t *)p, _pCudaOutBuffer, _width, _height)”, and we run multiple instances of that ROS application simultaneously. Under that condition (using our ROS application) we get a crash like this one and the Orin auto-reboots, but we cannot reproduce it quickly (it can take a day or more). Today we found that running multiple oceanFFT instances at the same time reproduces the crash quickly, so we want to know what causes this kernel crash (and auto-reboot) and how to fix it. We think the same fix may also apply to our ROS application.
Thanks.
Hi,
We are trying to reproduce this issue in our environment.
Will provide more info to you later.
Thanks.
Hi,
How long does it take to reproduce this issue?
We ran the oceanFFT sample for around an hour and it worked correctly (num up to 346).
Thanks.
Hi,
We tested with a 4K display screen (HDMI display mode); with num between 100 and 200, the issue reproduces.
Thanks
Hi,
Is upgrading to JetPack 6 an option for you?
We tested it on Orin with JetPack 6 and were not able to reproduce this issue.
So it’s recommended to give it a try.
Thanks.
Hi, AastaLLL
Our project is based on JetPack 5.1.1. To upgrade to JetPack 6 we would need to change our own drivers, rootfs, and application source code to adapt to the new JetPack 6 kernel, rootfs, and APIs, which would take a lot of time.
So could you restore the firmware to JetPack 5.1.1 and check this issue?
Please help us.
Thanks.
Hi,
We can check this issue on JetPack 5.1.1 again.
Will provide more info to you later.
Thanks.
Hi,
We tested this issue with JetPack 5.1.1 for 3 hours (num=500).
But the apps ran well without issue.
Thanks.
Hi, AastaLLL
Did you test with a 4K display screen (HDMI display mode)? Could you tell us your test conditions, so we can test again under the same conditions?
Thanks.
Hi,
We tested it again and still cannot reproduce the issue.
Both experiments were connected to a 4K display.
The main difference is that we test under maximum performance:
$ sudo nvpmodel -m 0
$ sudo jetson_clocks
Then run the test_shell you shared above.
Did you also test under maximum performance?
If not, could you give it a try?
Thanks.
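To confirm the maximum-performance settings actually took effect before re-running the stress test, the current state can be queried on the target. This is a device-configuration fragment for JetPack 5.x (the exact `--show` output format varies across releases):

```shell
# Set maximum performance (as suggested above), then verify:
sudo nvpmodel -m 0          # select the MAXN power mode
sudo jetson_clocks          # lock CPU/GPU/EMC clocks to maximum
sudo nvpmodel -q            # query the currently active power mode
sudo jetson_clocks --show   # print the current clock configuration
```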
Hi, AastaLLL
Currently we only set nvpmodel to mode 0 and did not run jetson_clocks (to set CPU and GPU performance to maximum). We will configure both “nvpmodel -m 0” and “jetson_clocks” and test again under that condition.
Thanks.
Hi,AastaLLL
We used “jetson_clocks” and “nvpmodel -m 0” and tested with an HDMI 4K display; we can still reproduce the issue. Which connector type does your screen use (HDMI or DP)? Which hardware platform do you use? Is it the AGX Orin devkit?
Thanks.
Hi,
Yes, we test this with Orin devkit + HDMI DP.
We will discuss this issue internally and share more info with you.
Thanks.