CUDA Fortran does not work on my computer now

I need help again: my CUDA Fortran code no longer runs. It worked well before. I guess the cause is a system update (Windows 10) or a firmware update (Dell Precision 7720). I removed and reinstalled the latest CUDA toolkit and the PGI Fortran compiler (community version), but that did not solve my problem:

The PGI fortran compiler itself works well – it prints out “hello world” correctly, and it compiles the “deviceQuery.cuf” code and runs correctly,

The CUDA toolkit works well – I can run the sample programs in the CUDA toolkit package correctly

It seems that the PGI fortran compiler is not calling CUDA gpu (Quadro P4000) correctly. The message when running my code is:

ALLOCATE: copyin Symbol Memcpy FAILED:13(invalid device symbol)

I also tested with the following sample CUDA Fortran code:

module mathOps
contains
  attributes(global) subroutine saxpy(x, y, a)
    implicit none
    real :: x(:), y(:)
    real, value :: a
    integer :: i, n
    n = size(x)
    i = blockDim%x * (blockIdx%x - 1) + threadIdx%x
    if (i <= n) y(i) = y(i) + a*x(i)
  end subroutine saxpy 
end module mathOps

program testSaxpy
  use mathOps
  use cudafor
  implicit none
  integer, parameter :: N = 40000
  real :: x(N), y(N), a
  real, device :: x_d(N), y_d(N)
  type(dim3) :: grid, tBlock

  tBlock = dim3(256,1,1)
  grid = dim3(ceiling(real(N)/tBlock%x),1,1)

  x = 1.0; y = 2.0; a = 2.0
  x_d = x
  y_d = y
  call saxpy<<<grid, tBlock>>>(x_d, y_d, a)
  y = y_d
  write(*,*) 'Max error: ', maxval(abs(y-4.0))
end program testSaxpy

The result when running this code is:

PGI$ ./saxpy.exe
Max error: 2.000000
Segmentation fault

What might be wrong with my computer? Thanks a lot!

Hi dypang,

I gave your program a try with PGI 19.10 on a Windows system with a GTX 1080, which is the closest device I have to your P4000. However, I didn’t see any issue, nor did cuda-memcheck report any problems.

“I removed and reinstalled the latest CUDA toolkit”

I’m assuming you just installed CUDA 10.2? It’s possible that this is the problem, since CUDA 10.2 came out only a few weeks ago and so is not yet supported by the 19.10 compilers. Though a program built with CUDA 10.1 should still run correctly with a 10.2 driver.

Try compiling with “-Mcuda=cuda10.1” so the compiler builds with the CUDA 10.1 components that ship with the compilers instead of trying to use the local 10.2 install.
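If the rebuilt binary still misbehaves, checking the status of the kernel launch may help narrow things down. The following is only a minimal sketch, assuming the mathOps module from the original post is compiled together with it. The program name checkSaxpy is made up for illustration. A "Max error" of 2.0 would be consistent with the kernel never having run, and the checks below are meant to surface the reason.

```fortran
! Sketch only: the saxpy test from the original post with status checks added.
! Assumes the mathOps module (with the saxpy kernel) is in the same source file.
program checkSaxpy
  use mathOps
  use cudafor
  implicit none
  integer, parameter :: N = 40000
  real :: x(N), y(N), a
  real, device :: x_d(N), y_d(N)
  type(dim3) :: grid, tBlock
  integer :: istat

  tBlock = dim3(256,1,1)
  grid = dim3(ceiling(real(N)/tBlock%x),1,1)

  x = 1.0; y = 2.0; a = 2.0
  x_d = x
  y_d = y
  call saxpy<<<grid, tBlock>>>(x_d, y_d, a)

  ! Errors detected at launch time (e.g. no usable kernel image for this GPU)
  istat = cudaGetLastError()
  if (istat /= cudaSuccess) &
    write(*,*) 'Launch error: ', trim(cudaGetErrorString(istat))

  ! Errors raised while the kernel executes
  istat = cudaDeviceSynchronize()
  if (istat /= cudaSuccess) &
    write(*,*) 'Execution error: ', trim(cudaGetErrorString(istat))

  y = y_d
  write(*,*) 'Max error: ', maxval(abs(y-4.0))
end program checkSaxpy
```

Building this with the same -Mcuda options as the failing program should show whether the failure happens at launch or during execution.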


Hi Mat, thanks, I will try your suggestion when I am in my office. But I am not sure it is the solution. Initially I was using CUDA 10.1 + PGI Fortran 19.04, and it did not work.

But you reminded me to try downgrading the CUDA tools, instead of always upgrading them!

Hi Mat, thanks for your suggestion, it turned out that -Mcuda=cuda10.1 did not work, but -Mcuda=cuda9.2 works

And I also realized that I do not need to remove CUDA toolkit 10.2, and I do not need to install CUDA toolkit 9.2. I only need to use -Mcuda=cuda9.2 with CUDA toolkit 10.2! That’s very convenient.

Interesting. Once I get a CUDA 10.2 SDK installed on a Windows system, I’ll see about trying to reproduce this issue. I’d think using 10.1 would work and you shouldn’t need to fall back to 9.2, but I am glad you got things working.


Hi Mat, I now need to run it on my new machine as well. It is a Dell Precision 7740 with a “Quadro RTX 5000” in it.

I installed CUDA toolkit version 10.2 for Ubuntu 18.04, and PGI Fortran community version 19.10.

The CUDA “deviceQuery” sample reports:

CUDA Driver Version / Runtime Version 10.2 / 10.2
CUDA Capability Major/Minor version number: 7.5

However, my code does not run on this machine. It seems that the GPU part is not functioning correctly. I tried different combinations of X.Y and ?? in “-Mcuda=cudaX.Y, cc??”, but none of them worked.

CUDA toolkit 10.2 + PGI 19.10 works on my older machine with the options “-Mcuda=cuda9.2, cc60”. That machine has a Quadro P4000 and runs Windows 10.

Are there any rules for selecting these compiler options so that PGI and the CUDA toolkit work together correctly?

Hi dypang,

As I noted before, PGI 19.10 doesn’t support CUDA 10.2, nor do I currently have the ability to test your configuration. So unfortunately, I don’t know what’s wrong. Though, I’m assuming you did try using “-Mcuda=cuda10.1,cc75”? The RTX devices (CC 7.5) need at least CUDA 10.1 enabled.
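
As a rough aid for choosing the cc value, a small CUDA Fortran program can print what the runtime linked into the executable actually sees. This is a sketch under assumptions rather than a definitive diagnostic. The program name queryCC is made up for illustration, and it relies on the cudafor version and device-property queries.

```fortran
! Sketch only: report driver/runtime versions and each device's compute capability.
program queryCC
  use cudafor
  implicit none
  type(cudaDeviceProp) :: prop
  integer :: istat, nDevices, i, drvVersion, rtVersion

  istat = cudaGetDeviceCount(nDevices)
  if (istat /= cudaSuccess) then
    write(*,*) 'cudaGetDeviceCount failed: ', trim(cudaGetErrorString(istat))
    stop
  end if

  ! Versions are encoded as 1000*major + 10*minor (CUDA 10.2 appears as 10020)
  istat = cudaDriverGetVersion(drvVersion)
  istat = cudaRuntimeGetVersion(rtVersion)
  write(*,'(a,i0)') 'Driver version:  ', drvVersion
  write(*,'(a,i0)') 'Runtime version: ', rtVersion

  ! Device numbering starts at 0
  do i = 0, nDevices - 1
    istat = cudaGetDeviceProperties(prop, i)
    write(*,'(a,i0,a,a)') 'Device ', i, ': ', trim(prop%name)
    write(*,'(a,i0,a,i0)') '  Compute capability: ', prop%major, '.', prop%minor
  end do
end program queryCC
```

The runtime version printed here reflects the CUDA components the compiler linked in (selected by -Mcuda=cudaX.Y), which may differ from the locally installed toolkit. A reported capability of 7.5 would correspond to cc75.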