Hey guys,
I was wondering if there’s any documentation on the proper way to use OpenMP with CUDA. For example, I’d like to do something like this:
#pragma omp parallel for
for (int n = 0; n < ngpu; n++)
{
    cudaSetDevice(n);
    CODE
}
CODE
#pragma omp parallel for
for (int n = 0; n < ngpu; n++)
{
    cudaSetDevice(n);
    CODE
}
I’m wondering whether this is possible in CUDA. I’m not sure exactly how contexts work, but I imagine this could cause problems, since the OpenMP threads are created at the start of each parallel loop and destroyed at the end of it. Can someone advise? :)
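To make the question concrete, here is a minimal sketch of the pattern I have in mind. The `scale` kernel, the helper names (`first_pass`, `second_pass`, `run_two_regions`), the buffer size and the `num_threads(ngpu)` clause are all placeholders I made up for illustration. It assumes that the current device is a per-host-thread setting (which is how I read the `cudaSetDevice` docs), so it calls `cudaSetDevice(n)` again in the second parallel region, and that a buffer allocated on GPU n in the first region is still usable in the second one.

```cuda
#include <cassert>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel standing in for the CODE placeholders inside the loops.
__global__ void scale(float *x, int len, float a)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < len) x[i] *= a;
}

const int LEN = 1 << 20;   // elements per GPU (arbitrary)
const int THREADS = 256;   // threads per block (arbitrary)

// First loop body: allocate a buffer of ones on GPU n and double it.
static bool first_pass(int n, float **d_x)
{
    std::vector<float> ones(LEN, 1.0f);
    if (cudaSetDevice(n) != cudaSuccess) return false;  // applies to the calling host thread
    if (cudaMalloc((void **)d_x, LEN * sizeof(float)) != cudaSuccess) return false;
    if (cudaMemcpy(*d_x, ones.data(), LEN * sizeof(float),
                   cudaMemcpyHostToDevice) != cudaSuccess) return false;
    scale<<<(LEN + THREADS - 1) / THREADS, THREADS>>>(*d_x, LEN, 2.0f);
    return cudaDeviceSynchronize() == cudaSuccess;
}

// Second loop body: triple the same buffer, read back element 0, then free it.
static bool second_pass(int n, float *d_x, float *out)
{
    if (cudaSetDevice(n) != cudaSuccess) return false;  // set again in this region
    scale<<<(LEN + THREADS - 1) / THREADS, THREADS>>>(d_x, LEN, 3.0f);
    if (cudaMemcpy(out, d_x, sizeof(float),
                   cudaMemcpyDeviceToHost) != cudaSuccess) return false;
    return cudaFree(d_x) == cudaSuccess;
}

// Two OpenMP parallel regions with serial host code in between.
bool run_two_regions(int ngpu, std::vector<float> &first)
{
    assert(ngpu > 0);
    std::vector<float *> d_x(ngpu, nullptr);  // one device buffer per GPU
    first.assign(ngpu, 0.0f);                 // element 0 read back from each GPU
    int ok = 1;

    #pragma omp parallel for num_threads(ngpu) reduction(&& : ok)
    for (int n = 0; n < ngpu; n++)
        if (!first_pass(n, &d_x[n])) ok = 0;

    // Serial host code (the middle CODE placeholder) would go here.

    #pragma omp parallel for num_threads(ngpu) reduction(&& : ok)
    for (int n = 0; n < ngpu; n++)
        if (!second_pass(n, d_x[n], &first[n])) ok = 0;

    return ok != 0;
}

int main()
{
    int ngpu = 0;
    if (cudaGetDeviceCount(&ngpu) != cudaSuccess || ngpu == 0) {
        std::printf("no CUDA devices found\n");
        return 1;
    }
    std::vector<float> first;
    bool ok = run_two_regions(ngpu, first);
    for (int n = 0; n < ngpu; n++)
        std::printf("GPU %d: x[0] = %g\n", n, first[n]);
    return ok ? 0 : 1;
}
```

With nvcc and a GCC host compiler, I would expect to build this with something like `nvcc -Xcompiler -fopenmp example.cu -o example`. Without the OpenMP flag the pragmas are ignored and the loops just run serially over the GPUs.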