I am writing a combined CPU-GPU code in CUDA and C++. Each node has 4 GPU cards and 24 or more CPU processors. The current code has a CPU part and a GPU part, with one CPU thread associated with each GPU, so I use 4 CPU threads for the 4 GPU cards. My question: can I use the remaining 20 threads with OpenMP (assuming one thread per processor) to parallelize the for loops in the CPU part of the code?
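Yes, you can use the CPU threads however you see fit. The host threads that drive the GPUs are ordinary CPU threads, so nothing prevents you from running OpenMP parallel loops on the remaining processors at the same time. Just cap the OpenMP thread count (20 in your case) so the node is not oversubscribed.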
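A minimal sketch of how the CPU side might look, assuming the GPU host threads are managed separately and are not shown here. The helper names `cpu_worker_threads` and `sum_of_squares` are made up for illustration and are not from your code. The idea is simply to reserve one thread per GPU and hand the rest to an OpenMP `parallel for` through the `num_threads` clause.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: number of threads left for OpenMP after
// reserving one host thread per GPU. Always returns at least 1.
int cpu_worker_threads(int total_cores, int gpu_count) {
    assert(total_cores > 0 && gpu_count >= 0);
    return std::max(1, total_cores - gpu_count);
}

// Hypothetical CPU-side loop: sums the squares of x using at most
// `threads` OpenMP threads. Without OpenMP enabled at compile time the
// pragma is ignored and the loop runs serially with the same result.
double sum_of_squares(const std::vector<double>& x, int threads) {
    (void)threads;  // unused when compiled without OpenMP
    double sum = 0.0;
    const long n = static_cast<long>(x.size());
    #pragma omp parallel for num_threads(threads) reduction(+:sum)
    for (long i = 0; i < n; ++i) {
        sum += x[i] * x[i];
    }
    return sum;
}
```

With 24 processors and 4 GPUs, `cpu_worker_threads(24, 4)` gives 20, which you would pass as the `threads` argument. Build with OpenMP enabled, for example `-fopenmp` with GCC, or `-Xcompiler -fopenmp` when compiling through `nvcc`. If your 4 GPU threads are themselves OpenMP threads, a loop like this inside them becomes a nested parallel region, so it is probably simpler to run the CPU loop from a thread outside that region.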