Jetson Orin Nano's RAM keeps getting full, the board crashes

Hello,

  • I have a Jetson Orin Nano Developer Kit 8GB and my goal is to implement real-time YOLO object detection with an RTSP source from an IP camera.
  • I have exported my custom-trained YOLOv11 model to TensorRT format and I am running it with the Ultralytics library and the yolo predict cfg CLI command; the inference itself works without problems. However, using the ‘free’ command, I have discovered that the used RAM is constantly increasing (while the free RAM is decreasing), eventually causing the board to crash.
  • I have tried using the stream=True parameter and the free RAM still decreases, only slower (see the sketch below the environment list). The program should run non-stop and save the inference results, so I need a long-term solution.
  • As for my current environment, I am using:
  1. JetPack 6.1 with CUDA 12.6, CUDA driver version 540.4.0, libcudnn 9.5
  2. TensorRT 10.3.0
  3. Torch 2.5.0
  4. Torchvision 0.20.0
  5. Ultralytics 8.3.64
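
For reference, the Python-API equivalent of what I am running looks roughly like this (a minimal sketch: the engine path and RTSP URL are placeholders; with stream=True, predict() returns a generator, so results are handled frame by frame instead of being collected in one list):

from ultralytics import YOLO

MODEL_PATH = "yolo11n.engine"                        # placeholder: TensorRT engine exported from my custom model
SOURCE = "rtsp://user:pass@192.168.1.10:554/stream"  # placeholder RTSP URL of the IP camera

model = YOLO(MODEL_PATH)

# With stream=True, predict() yields one Results object per frame,
# so each result can be processed/saved and then released.
for result in model.predict(source=SOURCE, stream=True):
    boxes = result.boxes  # process/save detections here, then let `result` go out of scope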

Not sure if you have a leak or not.
The problem is that you are asking a lot from 8 GB of RAM.
Set up your swap file on the NVMe to at least 60 GB and see what happens. If you are still running off the SD card, don’t use such a large swap file.
Also, install jtop and use GNOME System Monitor to watch memory; note that those are monitoring tools, not tools for finding leaks.

If it is a leak, you don’t have much chance of even coming close to finding it, given the complexity of and interactions between the packages.

Hi,

Do you see the RAM usage increase after the first frame of inference?

If so, it’s recommended to check your app, as there might be some memory leakage.
Have you checked the YOLOv11 detection sample from Ultralytics to see if it shows the same issue?
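
One quick way to check this is to log the process RSS around each frame, for example (a rough sketch: it assumes the psutil package is installed and uses placeholder model/source paths):

import psutil
from ultralytics import YOLO

proc = psutil.Process()                  # current process
model = YOLO("yolo11n.engine")           # placeholder engine path

# Print resident memory after every frame: a steady climb over many frames points
# at the app, while a one-time jump after the first frame is just model/context setup.
for i, result in enumerate(model.predict(source="test.mp4", stream=True)):
    rss_mb = proc.memory_info().rss / (1024 * 1024)
    print(f"frame {i}: RSS = {rss_mb:.1f} MiB")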

Thanks.

Hello,
Thanks for the response!
So, you’re saying it’s too big of a task for this specific board?

About the swap file, I don’t have an SSD attached at the moment, and I currently have 3.9 GB of swap. But I’ve noticed that the board only starts using swap after the RAM fills up, and it eventually crashes once the used swap grows too large. Therefore, I am not sure whether just adding more swap would be the solution.

Also, I have already installed jtop. As for GNOME System Monitor, I am connecting to the board remotely via SSH.

Hi,
Thanks for responding!

  • Yes, the RAM usage goes up after the first frame of inference.
  • I have also tried running the Ultralytics pre-trained model (exported to .engine format), testing on an mp4 video containing cars and pedestrians. (I did not find any video samples from Ultralytics, they only provide some images, and in their documentation they just test on YouTube videos, so I figured any video would do.) The inference was correct, but the behaviour was the same: the used RAM kept increasing.
  • I have tried checking the app using memory_profiler (roughly wired up as in the sketch after this list), and I got the results below, but I am not sure how to interpret them:




  • Can you tell me how I am supposed to efficiently check for memory leakage? And how am I supposed to fix it? I am only using the scripts from the Ultralytics library, nothing more.
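
For context, the memory_profiler run was wired up roughly like this (a sketch: the script and function names are just examples of decorating the predict loop with @profile):

from memory_profiler import profile
from ultralytics import YOLO

@profile  # memory_profiler prints a line-by-line memory report for this function
def run(engine="yolo11n.engine", source="test.mp4"):
    model = YOLO(engine)
    for result in model.predict(source=source, stream=True):
        pass  # save/process detections here

if __name__ == "__main__":
    run()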

You don’t.

You only have two choices. One is to move to a bare-metal box with an RTX or better GPU. The other is to patch up your board by adding a fast NVMe with a heatsink and bumping the swap up to 60-100 GB on that NVMe. If you need pure performance, buy a GPU and mainboard and build it yourself.

Hi,

You can check it with Valgrind.
Alternatively, our compute-sanitizer can also check for memory leaks:

/usr/local/cuda/bin/compute-sanitizer -h
...
Memcheck-specific options:
  --check-cache-control                 Check cache control memory accesses.
  --detect-missing-module-unload        Detect leaks caused by missing module unload calls. This option should not be used if the application uses the CUDA runtime.
  --leak-check arg (=no)                <full|no> Print leak information for CUDA allocations.
  --padding arg (=0)                    Size in bytes for padding buffer to add after each allocation.
  --report-api-errors arg (=explicit)   Print errors if any API call fails.
                                        all      : Report all CUDA API errors, including APIs invoked implicitly
                                        explicit : Report errors in explicit CUDA API calls only
                                        no       : Disable reporting of CUDA API errors
  --track-stream-ordered-races arg (=no)
                                        Track CUDA stream-ordered allocations races.
                                        all              : Track and report all CUDA stream-ordered allocations races
                                        use-before-alloc : Track and report use-before-alloc CUDA stream-ordered allocations races
                                        use-after-free   : Track and report use-after-free CUDA stream-ordered allocations races
                                        no               : Disable tracking and reporting for CUDA stream-ordered allocations races

Thanks.

I see. I wanted the cheapest setup that still gives good real-time inference for a program like this, and this board seemed like the right fit.
Thanks for the suggestions!

Hello,

Thanks for responding!
After testing some more, I have discovered that:

  • the used RAM actually stops increasing at around 3.2 GB,
  • then the buffer/cache memory grows to about 4 GB,
  • while the free memory is depleted (under 1 GB),
  • then the board starts using swap, which ultimately leads to the crash,
  • it took 3 hours and 37 minutes for the board to start using swap, at which point I stopped it.

Therefore I have used this command: sync; echo 3 > /proc/sys/vm/drop_caches to empty the page cache every other minute. The program ran for more than 20 hours straight. I am not sure it is the best solution to the problem, but I am considering it.
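
For reference, the periodic cache drop is just a small loop along these lines (a rough sketch: it has to run as root, and the interval is arbitrary):

import os
import time

# Rough sketch of the periodic page-cache drop described above. Must run as root.
while True:
    os.sync()                                    # flush dirty pages first, like `sync`
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")                           # same effect as `echo 3 > /proc/sys/vm/drop_caches`
    time.sleep(120)                              # wait a couple of minutes between drops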

Hi,

It is a possible WAR (workaround).
Does that already work well enough for your use case?

About the issue, is stream_inference the same as inference?
It looks like the memory increases after calling stream_inference() (maybe when loading the model),
but there is no change when calling inference().

Thanks.

Hello,

I have decided to run it again without emptying the cache memory at all, to see how long it takes to crash after it starts using swap. It entered swap after 1h45m, but it has now been running for 48h straight. I’ve noticed that both the used and the free RAM have gone up a bit.
The stats right now look like this:

[image: current memory stats]

So, I am waiting to see if it is going to crash at all. I think the first crash may have happened early on, before I knew about the stream=True parameter (my bad), but the used swap memory is still rising nonetheless.

About stream_inference() vs. inference(), I can’t test right now (the program is still running), but I am pretty sure I have already tested on videos without the stream=True parameter, and the behaviour was the same.

Thank you!

Hi,

Thanks for the update.
Please let us know if the crash happens again.