The GPU requires physical RAM, so even if you added swap, this would probably still fail. Swap can help indirectly, since it allows some ordinary user-space applications to swap out and thus leave more physical RAM for the GPU, but the return on this is limited. Basically, you must use less RAM.
The power model won’t help with this, but if you are running more than one thread, then limiting the number of threads will help; similarly, if you are using a certain number of CUDA cores, then using fewer will help. I’m not particularly good with the AI end, so I can’t tell you exactly how to go about this, but basically this is where you would start.
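As a starting point for a TensorFlow workload, the sketch below caps the CPU thread pools via environment variables and asks TensorFlow to allocate GPU memory on demand rather than grabbing it all up front. The thread counts are illustrative assumptions to tune for your model; `list_physical_devices` and `set_memory_growth` are standard TensorFlow 2.x APIs.

```python
# Sketch: reduce memory pressure before running TensorFlow-TRT inference.
# The thread counts below are illustrative assumptions, not recommendations.
import os

# Cap CPU-side threading (each extra worker thread can hold its own buffers).
os.environ["OMP_NUM_THREADS"] = "2"
os.environ["TF_NUM_INTRAOP_THREADS"] = "2"
os.environ["TF_NUM_INTEROP_THREADS"] = "1"

try:
    import tensorflow as tf
    # Ask TensorFlow to grow GPU allocations on demand instead of
    # reserving nearly all GPU-visible memory at startup.
    for gpu in tf.config.list_physical_devices("GPU"):
        tf.config.experimental.set_memory_growth(gpu, True)
except ImportError:
    pass  # TensorFlow not installed here; the env vars still apply on import
```

These environment variables must be set before TensorFlow is imported, which is why they come first in the script.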
@linuxdev @klyuan1986 Thanks for the response, but this didn't occur on my previous Jetson NX. I even tried your solution, @klyuan1986, but I use a Python script to run the TensorFlow-TRT inference.
The command waits for around 5 minutes, then the out-of-memory message appears in the top right corner.
Google Colab worked fine, and it has less RAM than the Jetson.
May I know more about the failure scenario (2 min later)?
Does it occur a while after inference starts?
If yes, it sounds like there is a memory leak in the implementation.
In general, memory usage should be stable across inference on each frame, since both the buffers and the engine are reused at inference time.
Would you mind checking your code for leaks first?
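A minimal way to check for Python-side leaks is to run the per-frame call in a loop and watch whether traced memory keeps growing. The sketch below uses the standard `tracemalloc` module; `looks_leaky` and the sample callables are hypothetical names introduced here, not part of TensorFlow — replace the callable with your actual per-frame TF-TRT inference call.

```python
# Hedged sketch: leak check via the standard tracemalloc module.
import tracemalloc

def looks_leaky(infer, frame, iters=5, tolerance=1.5, slack=16_384):
    """Return True if traced Python memory keeps growing across calls."""
    tracemalloc.start()
    infer(frame)                       # warm-up call
    baseline, _ = tracemalloc.get_traced_memory()
    for _ in range(iters):
        infer(frame)
    current, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return current > baseline * tolerance + slack

# A well-behaved step releases its buffers after each call:
print(looks_leaky(lambda f: [x * 2 for x in f], list(range(1000))))  # False

# A leaky step retains a reference to every frame it has seen:
retained = []
def leaky(frame):
    retained.append(list(frame))

print(looks_leaky(leaky, list(range(1000))))  # True
```

Note that `tracemalloc` only sees Python-heap allocations; for GPU-side growth on the Jetson, watch `tegrastats` output while the inference loop runs.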
By the way, please also check the command shared below: