"invalid context" when mixing OpenMP, OpenAcc

I’ve implemented the compute-intensive portion of my program in OpenACC, with a partial speed-up. To utilize both my GPU and my CPUs, I want to do something like this:

! Initialize everything, on both host and GPU
...
do i_t = 1,n_t
  !$omp parallel sections

  !$omp section
     ! update arrays on the cpu for this timestep
     call UpdateIntegrationStep()
  !$omp section
     ! update gpu-driven simulation for this timestep
     call Solve_GPU()
  !$omp section
     ! possibly do some disk i/o on the host for the timestep
     if (mod(i_t,10)==0) call SaveStep()
  
  !$omp end parallel sections
  ! synchronize certain subarrays between host and gpu
  !$acc update host(....)
enddo

Without the OpenMP directives, the code compiles and runs just fine. When I add them, I get this error from the thread trying to execute the GPU work:

call to cuModuleLoadData returned error 201:  Invalid context

I have tried messing about with “acc_set_device_num”, but with no luck. Any thoughts on whether what I’m trying is possible, and why it isn’t working?

Platform is:

  • PGI (Visual) Fortran 14.1 (yes, the new one)
  • Windows 7, 64-bit
  • Dual Xeon Sandy Bridge (12 cores, 24 with HT)
  • NVIDIA GeForce GTX 650 Ti

Hi Andrew,

Working with OpenMP and OpenACC together can be a bit tricky. Each OpenMP thread will create its own context, either implicitly when it encounters an OpenACC region (data or compute), or explicitly by calling “acc_init”. More importantly, you’ll need to manually decompose the problem to ensure the correct data gets to the correct GPU context. (Note that a host thread can create multiple contexts, but a single context can’t be shared by multiple host threads.)
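To illustrate (this is just a sketch, not from your code): explicit per-thread initialization for a single GPU, device 0, would be along these lines:

use omp_lib
use openacc
...
!$omp parallel
  ! Each thread selects the device and explicitly creates its own
  ! context here, rather than implicitly on its first ACC region.
  call acc_set_device_num(0, acc_device_nvidia)   ! single GPU assumed
  call acc_init(acc_device_nvidia)
!$omp end parallel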

Given the code snippet and the error, it appears that you’re initializing the GPU outside the parallel region, so the master thread is creating the GPU context. Then, when the parallel section is entered, a different thread executes the Solve_GPU routine and tries to access data that lives in the master thread’s context.

Since it’s nondeterministic which thread will execute a given section, you’ll need to encapsulate all of your GPU usage within the “Solve_GPU” routine (or that section); that way it won’t matter which thread executes it. The drawback is that you’d need to copy the data back and forth on each call.
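For example, a self-contained version might look something like this sketch (the array “a” and its update are illustrative placeholders, not your actual data):

subroutine Solve_GPU(a, n)
  implicit none
  integer, intent(in)    :: n
  real,    intent(inout) :: a(n)
  integer :: i
  ! The data region opens and closes inside this one call, so only the
  ! context of whichever thread runs this section is ever touched. The
  ! cost is copying "a" to and from the device on every call.
  !$acc data copy(a)
  !$acc parallel loop
  do i = 1, n
    a(i) = 2.0*a(i)   ! placeholder for the real timestep update
  enddo
  !$acc end data
end subroutine Solve_GPU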

Hope this helps,
Mat

It helps, Mat. Thanks. I think I can see how to accomplish what I wanted with OpenMP/OpenACC.

It would be something like the following, every time GPU code is executed:

!$omp parallel
if (omp_get_thread_num()==0) then   ! omp_get_thread_num() needs "use omp_lib"
  ! do GPU stuff on a fixed, known thread
else
  ! any CPU stuff to be executed in parallel here
endif
!$omp end parallel
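Fleshed out for my timestep loop (a sketch only, reusing the routine names from my pseudocode above; the update clause is elided as before, and at least three threads are assumed):

use omp_lib
...
do i_t = 1,n_t
  !$omp parallel
  select case (omp_get_thread_num())
  case (0)
    call Solve_GPU()              ! thread 0 always owns the GPU context
  case (1)
    call UpdateIntegrationStep()  ! CPU update runs concurrently
  case (2)
    if (mod(i_t,10)==0) call SaveStep()
  end select
  !$omp end parallel
  ! Back on the master thread (thread 0 of the team), whose context
  ! holds the device data, so this synchronization is legal:
  !$acc update host(....)
enddo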

At the moment, I’ve offloaded the whole simulation to the GPU, but being able to utilize both GPU and CPU would be…dandy.

Alas, I imagine that it’s time to learn MPI.