Jetson TX2 OpenMP: threads working on 5 cores, not 6 cores

Hi,

I have recently been testing some OpenCV algorithms with OpenMP and CUDA on a Jetson TX2. When I test CPU performance with OpenMP, I find that only 5 cores (4 Cortex-A57 and 1 from Nvidia) are running; the other Nvidia core is always sleeping. I am not sure how this can happen. How can I make all the cores work?

Picture from the Nsight profiler:
https://docs.google.com/drawings/d/1z3dNYD-CIXuoAsp3bJuHzOQ80pqq2_1DDV33iIhAcjY/edit?usp=sharing

The test code is simple:

#pragma omp parallel for
for (int i = 0; i < h; i++)
    #pragma omp parallel for
    for (int j = 0; j < w; j++)
        V.at<float>(i, j) = pow(a, hz.at<float>(i, j));

Thanks in advance.

Hi,

Have you enabled the Denver cores with nvpmodel first?

sudo nvpmodel -m 0
sudo ~/jetson_clocks.sh
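
To confirm that the mode change really brought all cores online, a small check like the one below can help. It is only a sketch, assuming a Linux/glibc system where `sysconf` supports `_SC_NPROCESSORS_CONF` and `_SC_NPROCESSORS_ONLN`; the names `CpuCount`, `queryCpuCount` and `printCpuCount` are made up for this example.

```cpp
#include <unistd.h>   // sysconf (glibc on Linux)
#include <cstdio>

// Hypothetical helper type for this example.
struct CpuCount {
    long configured;  // cores the kernel knows about
    long online;      // cores that are currently online
};

// Ask the OS how many cores exist and how many are online right now.
CpuCount queryCpuCount() {
    return { sysconf(_SC_NPROCESSORS_CONF), sysconf(_SC_NPROCESSORS_ONLN) };
}

// Print both numbers; call this from main() before the OpenMP test.
void printCpuCount() {
    CpuCount c = queryCpuCount();
    std::printf("configured: %ld, online: %ld\n", c.configured, c.online);
}
```

If the online count is lower than the configured count, some cores are still switched off and OpenMP cannot schedule threads on them.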

Thanks.

Hi AastaLLL,

Thank you for letting me know about nvpmodel. After I set it to mode 0 (2.0 GHz CPU and 1.3 GHz GPU), I saw an obvious speedup. However, one of the Denver cores still doesn't have any work to do. I updated the picture:

https://docs.google.com/drawings/d/1z3dNYD-CIXuoAsp3bJuHzOQ80pqq2_1DDV33iIhAcjY/edit?usp=sharing

Are there any examples available to demo this? I can try some other examples.

Thanks!

Hi,

Sorry for the late reply.

CPU2 (Denver) is working, just not concurrently with the other cores.
This is related to the multi-threading implementation. For example, that core may be used for the master thread.
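
One way to see where the threads land is to record the CPU id from inside a parallel region. This is a rough sketch, assuming Linux with glibc (`sched_getcpu` is Linux-specific) and a compiler flag such as `-fopenmp`; `threadId` and `cpusUsedByOpenMP` are names invented for this example. Thread 0 is the master thread.

```cpp
#include <sched.h>    // sched_getcpu (glibc, Linux-specific)
#include <cstdio>
#include <set>
#ifdef _OPENMP
#include <omp.h>
#endif

// OpenMP thread number, or 0 when built without OpenMP.
static int threadId() {
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}

// Run one parallel region and collect the CPU ids that hosted a thread.
std::set<int> cpusUsedByOpenMP() {
    std::set<int> cpus;
    #pragma omp parallel
    {
        int cpu = sched_getcpu();  // CPU this thread is on at this moment
        #pragma omp critical
        {
            std::printf("thread %d on cpu %d\n", threadId(), cpu);
            cpus.insert(cpu);
        }
    }
    return cpus;
}
```

The OS may move threads between cores during a run, so the result is only a snapshot. Setting the standard OpenMP environment variable `OMP_PROC_BIND=true` asks the runtime to keep each thread in place, which makes the picture easier to read.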

Thanks
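
A side note on the loop in the question: the inner `#pragma omp parallel for` opens a new (nested) parallel region for every row. With nested parallelism disabled, which is the default in common runtimes such as GCC's libgomp, each inner region runs on a single thread, so it adds overhead without adding parallelism. A sketch of an alternative with one parallel region and `collapse(2)` is below. It uses plain row-major `std::vector<float>` buffers instead of `cv::Mat` so that it is self-contained; `powMap` is a made-up name.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the loop in the question:
// V(i,j) = a ^ hz(i,j) on row-major h*w buffers.
std::vector<float> powMap(float a, const std::vector<float>& hz, int h, int w) {
    std::vector<float> V(static_cast<std::size_t>(h) * w);
    // One parallel region; collapse(2) shares all h*w iterations among threads.
    #pragma omp parallel for collapse(2)
    for (int i = 0; i < h; i++)
        for (int j = 0; j < w; j++) {
            std::size_t idx = static_cast<std::size_t>(i) * w + j;
            V[idx] = std::pow(a, hz[idx]);
        }
    return V;
}
```

Dropping the inner pragma and keeping only the outer `parallel for` would also work when `h` is much larger than the number of cores.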