Parallel processing capacity for CNN algorithms on a Jetson Orin board

Hi, I have a Jetson AGX Orin 32GB industrial board with JetPack 5.1. I am working with CNN-based deep learning algorithms in Python. When I run two algorithms in parallel, execution is slow, but when I check the GPU usage, the GPU is not fully utilized. Please help me with this issue. Is there a way to run these algorithms in parallel to achieve faster execution?
Algorithms used: 1. YOLOv8 2. UNet (both for image segmentation).

Hi,

Usually, this comes from an I/O bottleneck.
Could you try running a single inference to see whether it can use 99% of the GPU resources?
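
As a quick first check before any deeper profiling, something like the sketch below can show how the wall-clock time splits between loading, preprocessing, and inference. The three stage functions (`load_frame`, `preprocess`, `infer`) are hypothetical stand-ins that only sleep; they would need to be replaced with the real pipeline steps. Note that GPU frameworks usually launch work asynchronously, so the real inference call should be synchronized before the timer stops (for example with `torch.cuda.synchronize()` if PyTorch is used, which is only an assumption here).

```python
import time
from collections import defaultdict


class StageTimer:
    """Accumulates wall-clock time per named pipeline stage."""

    def __init__(self):
        self.totals = defaultdict(float)

    def run(self, stage, fn, *args):
        # Time a single call and add it to the running total for this stage.
        start = time.perf_counter()
        result = fn(*args)
        self.totals[stage] += time.perf_counter() - start
        return result

    def shares(self):
        # Fraction of the total measured time spent in each stage.
        total = sum(self.totals.values())
        return {k: v / total for k, v in self.totals.items()} if total else {}


# Hypothetical stand-ins for the real pipeline steps (they only sleep).
def load_frame(i):
    time.sleep(0.01)           # pretend disk or camera read
    return [i] * 8


def preprocess(frame):
    time.sleep(0.002)          # pretend resize / normalize on the CPU
    return [x * 2 for x in frame]


def infer(batch):
    time.sleep(0.001)          # pretend model call; synchronize the GPU here
    return sum(batch)


def profile_pipeline(n_frames=20):
    timer = StageTimer()
    for i in range(n_frames):
        frame = timer.run("load", load_frame, i)
        batch = timer.run("preprocess", preprocess, frame)
        timer.run("infer", infer, batch)
    return timer.shares()


if __name__ == "__main__":
    for stage, share in profile_pipeline().items():
        print(f"{stage}: {share:.0%}")
```

If most of the time lands in the load or preprocess stages, the GPU is probably idle while waiting for the CPU side, which would match the low utilization.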

Below is a ResNet-based UNet with DeepStream for your reference:

Thanks.

No, even when I run a single inference I cannot use 99% of the GPU resources. I have also tested other semantic segmentation algorithms. I need to use 99% of the GPU, and to handle memory, when I run two or more algorithms in a single instance.

Thanks.

Hi,

Which frameworks do you use?
Is it possible that the GPU is waiting for data before it can execute?
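
If the GPU does turn out to be waiting for data, one common remedy is to read ahead in a background thread so that loading overlaps with inference. The following is only a minimal, framework-agnostic sketch: `prefetch` is a hypothetical helper, and `load_frames` and `infer` are placeholders that sleep instead of doing real work.

```python
import queue
import threading
import time

_SENTINEL = object()  # marks the end of the stream


def prefetch(source, max_items=4):
    """Yield items from `source` while a background thread reads ahead."""
    q = queue.Queue(maxsize=max_items)

    def worker():
        try:
            for item in source:
                q.put(item)    # blocks when the queue is full
        finally:
            q.put(_SENTINEL)   # always signal completion to the consumer

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = q.get()
        if item is _SENTINEL:
            return
        yield item


# Placeholder producer and consumer; replace with real loading and inference.
def load_frames(n):
    for i in range(n):
        time.sleep(0.005)      # pretend disk or camera read
        yield i


def infer(frame):
    time.sleep(0.005)          # pretend model call
    return frame * 2


def run(n=10):
    # Loading of frame i+1 overlaps with inference on frame i.
    return [infer(f) for f in prefetch(load_frames(n))]


if __name__ == "__main__":
    print(run())
```

A thread helps when the loading step is I/O-bound or runs in native code that releases the Python GIL. CPU-heavy preprocessing written in pure Python would more likely need separate processes.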

Would you mind profiling your pipeline with Nsight Systems to find where the bottleneck comes from?
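
To make the timeline easier to read, the Python stages can be wrapped in NVTX ranges so they show up as named blocks in the report. This is a sketch that assumes the `nvtx` package from PyPI is installed (it falls back to a no-op otherwise); the `mark` helper, the `process` function, and its stage names are hypothetical placeholders for the real pipeline.

```python
import contextlib

try:
    import nvtx  # assumption: installed with `pip install nvtx`

    def mark(name):
        # Named range that Nsight Systems shows on the NVTX timeline row.
        return nvtx.annotate(name)
except ImportError:
    def mark(name):
        # No-op fallback so the script still runs without the package.
        return contextlib.nullcontext()


def process(frame):
    with mark("preprocess"):
        data = [x / 255.0 for x in frame]   # placeholder preprocessing
    with mark("inference"):
        return sum(data)                    # placeholder for the model call


if __name__ == "__main__":
    print(process([255, 255]))
```

The script can then be profiled with, for example, `nsys profile --trace=cuda,nvtx,osrt -o pipeline_report python3 my_pipeline.py`, where `my_pipeline.py` and `pipeline_report` are placeholder names. Long gaps between the CUDA kernels that line up with the preprocess range would point to a CPU-side bottleneck.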

Thanks.
