On the TX2, when I run a CUDA program (using the GPU to process something), CPU utilization is almost 100%. Following the NVIDIA documentation, I call cudaSetDeviceFlags(cudaDeviceScheduleBlockingSync) during program initialization, but it has no effect (although the same method works on the x86 platform). How can I reduce a CUDA program's CPU utilization? Thanks.
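For reference, here is a minimal sketch of the setup described above. It is not the original program: the kernel `busyKernel`, the `CHECK` macro, and the buffer size are made up for illustration. It assumes the flag is set before any other CUDA runtime call that creates the context, since on some CUDA versions setting it later may fail or be ignored. Reading the flags back with cudaGetDeviceFlags is one way to confirm the setting actually took effect.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Hypothetical kernel that keeps the GPU busy for a while.
__global__ void busyKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = data[i];
        for (int k = 0; k < 10000; ++k) {
            v = v * 1.0001f + 0.5f;
        }
        data[i] = v;
    }
}

int main()
{
    // Assumption: this is the first CUDA runtime call in the process,
    // so the flag is in place before the context is created.
    CHECK(cudaSetDeviceFlags(cudaDeviceScheduleBlockingSync));

    // Read the flags back to confirm the scheduling mode was accepted.
    unsigned int flags = 0;
    CHECK(cudaGetDeviceFlags(&flags));
    printf("blocking sync requested: %s\n",
           (flags & cudaDeviceScheduleMask) == cudaDeviceScheduleBlockingSync
               ? "yes" : "no");

    const int n = 1 << 20;  // illustrative buffer size
    float *d_data = nullptr;
    CHECK(cudaMalloc(&d_data, n * sizeof(float)));
    CHECK(cudaMemset(d_data, 0, n * sizeof(float)));

    busyKernel<<<(n + 255) / 256, 256>>>(d_data, n);
    CHECK(cudaGetLastError());

    // With blocking sync, the host thread should sleep here instead of
    // spinning while it waits for the GPU to finish.
    CHECK(cudaDeviceSynchronize());

    CHECK(cudaFree(d_data));
    return 0;
}
```

If the flag reads back as set and the CPU still sits near 100% during cudaDeviceSynchronize, the spinning is probably happening somewhere other than the application code, for example in the driver or runtime version installed on the board.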
Could you profile your program with nvprof first?
This will help us figure out what the CPU is mainly busy with, so we can give further suggestions.
sudo ./nvprof -o data.nvvp [your program]
I solved this issue by installing the newest JetPack, thanks.