Jetson TX2 OpenMP: threads working on 5 cores, not 6 cores


I have recently been testing some OpenCV algorithms with OpenMP and CUDA on a Jetson TX2. When I test CPU performance with OpenMP, only 5 cores (4 Cortex-A57 and 1 from NVIDIA) are running; the other NVIDIA core is always sleeping. I am not sure why this happens. How can I get all the cores working?

Here is the screenshot from the Nsight profiler:

The testing code is simple:

#pragma omp parallel for
for (int i = 0; i < h; i++)
    #pragma omp parallel for
    for (int j = 0; j < w; j++)
        <float>(i,j) = pow(a, <float>(i,j));

Thanks in advance.


Have you enabled the Denver cores with nvpmodel first?

sudo nvpmodel -m 0
sudo ~/


Hi AastaLLL,

Thank you for letting me know about nvpmodel. After I set it to mode 0 (2.0 GHz CPU and 1.3 GHz GPU), I saw an obvious speedup. However, one of the Denver cores still doesn't have any work to do. I updated the picture:

Are there any examples available to demo this? I can try some other examples.



Sorry for the late reply.

CPU2 (Denver) is working, just not concurrently with the other cores.
This is related to the multithreading implementation. For example, that core may be used as the master thread.