I am trying to develop a program that uses multiple GPUs independently with CUDA.
I was wondering whether I could activate multiple devices within a
parallel OpenMP for loop. The basic structure I have in mind is:
#pragma omp parallel for
for (int device = 0; device < deviceCount; device++) {
    cudaSetDevice(device);  // select this iteration's GPU for the calling thread
    // do GPU-intensive work
}
I would appreciate it if you could tell me whether this is possible.
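
In case it helps, here is a fuller sketch of the loop above. It relies on my understanding that the current device set by cudaSetDevice is per host thread state, so each OpenMP thread can select its own GPU. The kernel name fill, the buffer size n, and the num_threads(deviceCount) clause are placeholders I made up for illustration, not part of my real program, and error checking is mostly left out.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <omp.h>

// Hypothetical placeholder kernel standing in for the real GPU-intensive work.
__global__ void fill(int *out, int n, int value) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = value;
}

int main() {
    int deviceCount = 0;
    if (cudaGetDeviceCount(&deviceCount) != cudaSuccess || deviceCount == 0) {
        std::fprintf(stderr, "no CUDA devices found\n");
        return 1;
    }

    const int n = 1 << 20;  // arbitrary buffer size, chosen for illustration

    // Request one OpenMP thread per device. If the runtime grants fewer threads,
    // a thread simply handles several devices one after another.
    #pragma omp parallel for num_threads(deviceCount)
    for (int device = 0; device < deviceCount; device++) {
        // Select the GPU for the calling host thread only.
        cudaSetDevice(device);

        int *d_out = nullptr;
        cudaMalloc(&d_out, n * sizeof(int));

        // Launch the placeholder work on the selected device and wait for it.
        fill<<<(n + 255) / 256, 256>>>(d_out, n, device);
        cudaDeviceSynchronize();

        // Copy one element back as a quick sanity check.
        int first = -1;
        cudaMemcpy(&first, d_out, sizeof(int), cudaMemcpyDeviceToHost);
        cudaFree(d_out);

        std::printf("thread %d used device %d, first element = %d\n",
                    omp_get_thread_num(), device, first);
    }
    return 0;
}
```

With a GCC host compiler I would expect to build this with something like nvcc -Xcompiler -fopenmp example.cu, where example.cu is just a placeholder file name.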