Multiple GPUs error cuLaunchKernel 400

I am testing the multiple-GPU example on page 16 of the PGI Accelerator Compilers OpenACC Getting Started Guide v13.2. Using 1 OpenMP thread and 1 GPU works fine, but with 2 GPUs and 2 OpenMP threads I get the error message cuLaunchKernel 400: Invalid handle. The machine has 2 M2070 GPUs, the latest NVIDIA driver for Oracle Linux 6.3, and the PGI CDK 13.2.

I have the same problem :S

Sorry about that. This is a known issue that has been fixed in the 13.3 release (released yesterday, March 12th, on Linux; Windows will follow shortly).

% setenv OMP_NUM_THREADS 2
% pgf90 -acc multi.f90 -mp -lacml -V13.2 ; a.out
 Host Serial    2489.612915315794     
call to cuLaunchKernel returned error 400: Invalid handle
call to cuMemcpyDtoHAsync returned error 4: Deinitialized
% pgf90 -acc multi.f90 -mp -lacml -V13.3 ; a.out
 Host Serial    2489.612915315794     
 Multi-Device Parallel    2489.612915315794
- Mat

Hi Mat, as you said, this issue is fixed in version 13.3 for our code too.

Thank you,

Hi, I have the same problem using PGI 14.3 on my Windows machine. This machine has 4 GeForce GTX 780 Ti GPUs. Using 1 OpenMP thread with 1 GPU works fine, but when I try to use 2 OpenMP threads, each driving one GPU, I get this error. Here is the code snippet:

#pragma omp parallel num_threads(2)
{
	int i, j, k;
	int id, blocks, start, end;
	id = omp_get_thread_num();
	blocks = n/threads;
	start = id*blocks;
	end = (id+1)*blocks;
	acc_set_device_num(id+2, acc_device_nvidia);

	printf("copying %d\n", id);
	#pragma acc data copyin(aa[start*n:blocks*n])\

	{
		printf("kernel %d\n", id);
		#pragma acc kernels loop collapse(2) private(j,k)
		for(i=start; i<end; i++)
		{
			for(j=0; j<n; j++)
			{
				float c = 0.0f;
				for(k=0; k<n; k++)
					c += aa[i*n+k] * bb[k*n+j];
				cc[i*n+j] = c;
			}
		}
		printf("after kernel %d\n", id);
	}
}

And the output:

copying 0
copying 1
kernel 1
kernel 0
call to cuLaunchKernel returned error 400: Invalid handle

My compilation command:

pgcc -acc -mp -V14.3 -Minfo=accel -fast multi.c
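
For anyone comparing their own code against the snippet above, here is a minimal sketch of the one-OpenMP-thread-per-GPU row split it appears to implement. It is not a fix for the cuLaunchKernel error, only a cleaned-up illustration of the pattern. Assumptions that are not in the original post: the function names row_range and matmul_multi are hypothetical, the copyin/copyout clauses for bb and cc are my guess at a reasonable data region, the device is picked as id modulo the device count instead of the hard-coded id+2, and leftover rows are spread over the first threads when n is not divisible by the thread count. The _OPENMP and _OPENACC guards let the same file build and run serially with a plain C compiler.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Half-open row range [*start, *end) owned by worker `id` out of `workers`.
   Any remainder rows go to the first (n % workers) workers. */
static void row_range(int n, int workers, int id, int *start, int *end)
{
    assert(n >= 0 && workers > 0 && id >= 0 && id < workers);
    int base = n / workers;
    int extra = n % workers;
    *start = id * base + (id < extra ? id : extra);
    *end = *start + base + (id < extra ? 1 : 0);
}

/* cc = aa * bb for n x n row-major matrices.
   `workers` is the requested number of host threads (one per device). */
static void matmul_multi(const float *restrict aa, const float *restrict bb,
                         float *restrict cc, int n, int workers)
{
    assert(workers > 0);
    #pragma omp parallel num_threads(workers)
    {
        int id = 0, nthr = 1;   /* serial fallback when OpenMP is off */
#ifdef _OPENMP
        id = omp_get_thread_num();
        nthr = omp_get_num_threads();   /* may be fewer than requested */
#endif
#ifdef _OPENACC
        /* Assumption: bind each thread to device (id mod device count). */
        int ndev = acc_get_num_devices(acc_device_nvidia);
        if (ndev > 0)
            acc_set_device_num(id % ndev, acc_device_nvidia);
#endif
        int start, end;
        row_range(n, nthr, id, &start, &end);
        int rows = end - start;

        if (rows > 0) {
            /* Each thread moves only its own rows of aa and cc, plus all of bb. */
            #pragma acc data copyin(aa[start*n:rows*n], bb[0:n*n]) copyout(cc[start*n:rows*n])
            {
                #pragma acc kernels loop collapse(2)
                for (int i = start; i < end; i++) {
                    for (int j = 0; j < n; j++) {
                        float c = 0.0f;
                        for (int k = 0; k < n; k++)
                            c += aa[i*n + k] * bb[k*n + j];
                        cc[i*n + j] = c;
                    }
                }
            }
        }
    }
}
```

Calling matmul_multi(aa, bb, cc, n, 2) requests two host threads; the actual split is based on the thread count OpenMP reports at run time, so the result still covers every row if fewer threads are granted.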


Thanks miki_zizou. I’ve recreated the error here, filed a problem report (TPR#20174), and sent it to engineering for further investigation.

Best Regards,

Is there a solution available?

I encountered this problem in PGI 17.5 too.

Hi IngoSchulz85971,

TPR#20174 was fixed a while ago, and I double-checked that the error miki_zizou was seeing does not occur with 17.5. So while the error message may be the same, its cause is different.

Can you please post, or send to PGI Customer Service, a reproducing example as well as more information about your environment, such as the OS, target devices, compilation flags, and details on how you run the program?
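
When collecting the environment details requested above, a small helper like the following may be handy to paste into a reproducer. It is only a sketch: the function name env_summary and the output format are hypothetical, and it reports just what the standard _OPENACC macro, acc_get_num_devices, and omp_get_max_threads expose. Built without -acc or -mp, it falls back to zeros and a single thread.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Write a one-line environment summary into buf.
   Returns the snprintf result (characters that would be written). */
static int env_summary(char *buf, size_t len)
{
    int openacc_version = 0;  /* stays 0 when built without OpenACC */
    int devices = 0;          /* NVIDIA devices visible to the runtime */
    int max_threads = 1;      /* stays 1 when built without OpenMP */
#ifdef _OPENACC
    openacc_version = _OPENACC;   /* yyyymm date of the supported spec */
    devices = acc_get_num_devices(acc_device_nvidia);
#endif
#ifdef _OPENMP
    max_threads = omp_get_max_threads();
#endif
    return snprintf(buf, len,
                    "openacc=%d nvidia_devices=%d omp_max_threads=%d",
                    openacc_version, devices, max_threads);
}
```

Printing the resulting string at program start, together with the compile line and the OMP_NUM_THREADS setting, covers most of what is asked for in the request above.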


Hi, I sent a reproducing example and I am now waiting for a response :)