I have recently been testing some OpenCV algorithms with OpenMP and CUDA on a Jetson TX2. When I measured CPU performance with OpenMP, I found that only 5 cores are running (4 Cortex-A57 cores and 1 of the NVIDIA cores); the other NVIDIA core is always asleep. Why does this happen, and how can I get all of the cores working?
Screenshot from the Nsight Systems profiler:
The test code is simple:
#pragma omp parallel for
for(int i=0;i<h;i++)
    #pragma omp parallel for
    for(int j=0;j<w;j++)
        V.at<float>(i,j) = pow(a,hz.at<float>(i,j));
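For anyone who wants to reproduce this without OpenCV, here is a minimal sketch of the same loop. It makes several assumptions that are not part of my actual test: `std::vector<float>` buffers stand in for the `cv::Mat` objects, the names `pow_map` and `max_omp_threads` are hypothetical, and there is a single pragma on the outer loop instead of the nested ones. I have not verified whether any of these differences change which cores get used.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Upper bound on the threads OpenMP would use for a parallel region.
// Returns 1 when the file is built without OpenMP support.
int max_omp_threads() {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

// Hypothetical stand-in for the test loop: V(i,j) = a^hz(i,j),
// with hz and V stored as row-major h x w buffers instead of cv::Mat.
void pow_map(float a, const std::vector<float>& hz,
             std::vector<float>& V, int h, int w) {
    assert(hz.size() == static_cast<std::size_t>(h) * w);
    V.resize(hz.size());

    // Rows are split across the OpenMP thread team; each thread
    // walks its rows sequentially.
    #pragma omp parallel for
    for (int i = 0; i < h; i++) {
        for (int j = 0; j < w; j++) {
            const std::size_t idx = static_cast<std::size_t>(i) * w + j;
            V[idx] = std::pow(a, hz[idx]);
        }
    }
}
```

With GCC this needs the `-fopenmp` flag. Printing `max_omp_threads()` before calling `pow_map` shows how many threads the OpenMP runtime is prepared to start, which can be compared against the cores that appear active in the profiler.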
Thanks in advance.