Low number of cores being used by Jetson TX2


I am working on a face detection project using a Jetson TX2 and OpenCV 3.4.1, which I rebuilt from source with CUDA support. For face detection, I am using a pre-trained DNN model and running inference with the DNN module included in OpenCV. While my program is running, I checked the number of cores being used during face detection (using tegrastats) and noticed that just 39 cores are in use (20% of the total). Why is my Jetson using only 20% of the GPU cores for inference? Do I need to set a parameter to increase the number of GPU cores being used? Is it a problem with the DNN framework? If I increase the number of GPU cores being used, can I expect faster detection?



1. Please maximize the device performance first.

sudo nvpmodel -m 0
sudo jetson_clocks

2. If there is no obvious improvement after step 1, your application may be bottlenecked by I/O.
We recommend trying our implementation, which has an optimized pipeline for Jetson:
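
Separately from the implementation referred to above, one rough way to check whether I/O is the bottleneck is to time the capture stage and the inference stage independently. The sketch below is only an illustration: `time_stages`, `fake_grab`, and `fake_infer` are hypothetical names, and the two stand-in functions just sleep. In a real application they would be replaced by the frame read (for example `cv2.VideoCapture.read()`) and the DNN forward pass.

```python
import time


def time_stages(grab_frame, run_inference, n_frames=50):
    """Return mean seconds per frame spent in capture and in inference."""
    io_total = 0.0
    infer_total = 0.0
    for _ in range(n_frames):
        t0 = time.perf_counter()
        frame = grab_frame()        # I/O stage: camera read, decode, copy
        t1 = time.perf_counter()
        run_inference(frame)        # compute stage: preprocessing + forward pass
        t2 = time.perf_counter()
        io_total += t1 - t0
        infer_total += t2 - t1
    return io_total / n_frames, infer_total / n_frames


if __name__ == "__main__":
    # Hypothetical stand-ins (assumptions, not part of any library):
    # swap in the real frame read and the real DNN forward pass.
    def fake_grab():
        time.sleep(0.02)
        return object()

    def fake_infer(frame):
        time.sleep(0.005)

    io_s, infer_s = time_stages(fake_grab, fake_infer, n_frames=10)
    print("capture: %.1f ms/frame, inference: %.1f ms/frame"
          % (io_s * 1e3, infer_s * 1e3))
```

If the capture time per frame dominates, raising GPU clocks is unlikely to help much and the input pipeline is the place to look. If inference dominates, the I/O path is probably not what is holding the application back.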