Hi developers!
Recently I noticed something strange while running a Python script for inference with my custom YOLOv4-tiny model loaded via cv2.dnn.readNetFromDarknet(). The program works just fine but at a low FPS; judging from other videos on YouTube, this seems to be normal when detecting objects with YOLO.
When I checked the jtop monitor (an app developed specifically for the Jetson Nano) while running my program, it gave me the following results:
From what I can tell, the program is only using the four CPU cores instead of the GPU.
Just take a look at the GPU tab of jtop:
It seems the CUDA cores are barely working. The CPU tab is a whole different story: all cores are running at 95% on average:
Two things may be happening: either jtop is not trustworthy, or the GPU is lazing around and letting the CPU take all the work. I have compiled OpenCV to work with CUDA:
The BIS parameter is just the blob image size, which I can change in real time from 32x13 = 416 down to 32.
I compiled my OpenCV following this guide: Install OpenCV 4.5 on Jetson Nano - Q-engineering
Is there something else I need to install, or something I need to add to my Python script, in order to use CUDA-accelerated OpenCV, or am I already using it?
Thank you in advance for any replies!