pgfortran works for CUDA but not OpenACC

Hello PGI Gurus,

My laptop (Ubuntu 14.04, GeForce GTX 960M) currently has the CUDA 7.5 toolkit with driver version 352.63. CUDA C and C++ appear to be working properly.

I recently installed the OpenACC toolkit to play around with accelerating Fortran code using OpenACC.

pgfortran generates CUDA Fortran executables from .cuf files that appear to work fine. I have verified they are executing on the GPU by running them in the visual profiler.

pgfortran also generates OpenACC executables that work when the host is the target, i.e.

mcarilli:test$ pgfortran -acc -ta=host -Minfo=accel -o task3 task3.f90

generates a functioning host executable (task3).

However, when the GPU is the target, pgfortran generates OpenACC executables that fail at run time. An example of such a failure:

mcarilli:test$ pgfortran -acc -ta=nvidia -Minfo=accel -o task3 task3.f90
main:
     22, Generating copy(a(:,:))
         Generating create(anew(:,:))
     25, Generating present(a(:,:),anew(:,:))
     26, Loop is parallelizable
     27, Loop is parallelizable
         Accelerator kernel generated
         Generating Tesla code
         26, !$acc loop gang ! blockidx%y
         27, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
         30, Max reduction generated for error
     35, Generating present(a(:,:),anew(:,:))
     36, Loop is parallelizable
     37, Loop is parallelizable
         Accelerator kernel generated
         Generating Tesla code
         36, !$acc loop gang ! blockidx%y
         37, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
mcarilli:test$ ./task3
Jacobi relaxation Calculation: 1024 x 1024 mesh
call to cuInit returned error -1: Other

Some other information that seems relevant is strange output from pgaccelinfo:

mcarilli:test$ pgaccelinfo

NVRM version:                  NVIDIA UNIX x86_64 Kernel Module  352.63  Sat Nov  7 21:25:42 PST 2015
No accelerators found.
Try pgaccelinfo -v for more information
mcarilli:test$ pgaccelinfo -v

NVRM version:                  NVIDIA UNIX x86_64 Kernel Module  352.63  Sat Nov  7 21:25:42 PST 2015
could not initialize CUDA runtime, error code=-1

OpenCL Platform:               NVIDIA CUDA
OpenCL Vendor:                 NVIDIA Corporation

Device Number:                 0
Device Name:                   GeForce GTX 960M
Available:                     Yes
Compiler Available:            Yes
Device Version:                OpenCL 1.2 CUDA
Global Memory Size:            2147352576
Maximum Object Size:           536838144
Global Cache Size:             81920
Max Clock (MHz):               1176
Compute Units:                 5
Constant Memory Size:          65536
Local Memory Size:             49152
Workgroup Size:                1024
Address Bits:                  64
ECC Support:                   No
libcoi_host.so not found

My LD_LIBRARY_PATH is

mcarilli:test$ echo $LD_LIBRARY_PATH
/usr/local/cuda/lib64/:/usr/local/cuda/lib64/stubs/:
mcarilli:test$

Any ideas on why this is happening? Why would pgfortran succeed for CUDA Fortran code, but produce OpenACC executables that fail to run on the device? And why would pgfortran succeed for CUDA Fortran code at all if pgaccelinfo reports “No accelerators found”?

Please let me know if you require any additional information to diagnose the issue.

Much appreciated,
Michael

Hi Michael,

When I’ve seen this error, it has been because the runtime can find the OpenCL runtime library but not the driver’s libcuda.so library. Can you check whether “/usr/lib64/libcuda.so” exists?

  • Mat

Hi Mat,

Thanks so much for your help.

Within /usr, libcuda.so apparently resides in two places:

mcarilli:usr$ find /usr/ -name libcuda.so
/usr/local/cuda-7.5/targets/x86_64-linux/lib/stubs/libcuda.so
/usr/lib/i386-linux-gnu/libcuda.so
find: `/usr/share/doc/google-chrome-stable': Permission denied
mcarilli:usr$

I also have a symbolic link to
/usr/local/cuda-7.5/targets/x86_64-linux/lib/stubs/libcuda.so,
namely
/usr/local/cuda/lib64/stubs.

I tried adding /usr/lib/i386-linux-gnu/ to my LD_LIBRARY_PATH and recompiling, but the same problem occurred.

How can I tell the OpenACC runtime where libcuda.so resides?

Much appreciated,
Michael

Hi Michael,

The “stubs” libraries are just dummies used for linking on systems without the drivers installed, so they can’t be used here. Also, the 32-bit libraries can’t be used with 64-bit binaries.
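
A quick way to tell a 32-bit library from a 64-bit one is to read the ELF header directly. The following is a minimal sketch, not a PGI or NVIDIA tool: `elf_class` is a hypothetical helper name, and it assumes a POSIX shell with `od` and `tr` available. It relies only on the ELF layout, where the byte at offset 4 (EI_CLASS) is 1 for 32-bit and 2 for 64-bit objects.

```shell
#!/bin/sh
# elf_class (hypothetical helper): print 32 or 64 for an ELF file,
# "not-elf" for anything else. Symlinks are followed because od opens
# the file the link points to.
elf_class() {
    # Every ELF file starts with the four bytes 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$1" 2>/dev/null | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not-elf"
        return 1
    fi
    # EI_CLASS is the fifth byte: 1 means 32-bit, 2 means 64-bit.
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')
    case "$class" in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo "unknown"; return 1 ;;
    esac
}
```

For instance, `elf_class /usr/lib/i386-linux-gnu/libcuda.so` would be expected to print 32 on the system described in this thread. Note that this check says nothing about whether a file is a stub: a stub can be a perfectly valid 64-bit ELF object and still be unusable at run time.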

Can you try installing the 64-bit CUDA driver?

  • Mat

Hi Mat,

Thanks again for your reply.

Before I start breaking things, I would like to confirm: By “64-bit CUDA driver” do you mean the Nvidia driver for my card (http://www.nvidia.com/download/driverResults.aspx/95159/en-us)?

For the record, my current installed driver is the following:

mcarilli:~$ nvidia-smi
Tue Jan 12 10:11:53 2016       
+------------------------------------------------------+                       
| NVIDIA-SMI 352.63     Driver Version: 352.63         |                       
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 960M    Off  | 0000:01:00.0     Off |                  N/A |
| N/A   46C    P8    N/A /  N/A |    212MiB /  2047MiB |      5%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      1199    G   /usr/bin/X                                     135MiB |
|    0      2959    G   compiz                                          65MiB |
|    0      4549    G   ...s-passed-by-fd --v8-snapshot-passed-by-fd     2MiB |
+-----------------------------------------------------------------------------+
mcarilli:~$

Also, as I mentioned, both CUDA C and CUDA Fortran with pgfortran work fine. Finally, the Nvidia GPU is currently driving my X window display. Would installing a fresh driver break any of those?

Sorry for the confusion, I just want to make sure you have all information before I proceed.

Regards,
Michael

> would like to confirm: By “64-bit CUDA driver” do you mean the Nvidia driver for my card

Yes.

> Also, as I mentioned, both CUDA C and CUDA Fortran with pgfortran work fine. Finally, the Nvidia GPU is currently driving my X window display.

Yes, I don’t have a definitive answer as to why these work. Possibly because you’ve compiled in 32-bit mode?

> Would installing a fresh driver break any of those?

I can’t guarantee that it won’t, since I don’t know your system, but I doubt it.

Keep in mind my recommendations are strictly based on past experience with other users. They are educated guesses, but still guesses.

  • Mat

Hi Mat,

I fixed the problem without having to reinstall the drivers. Basically, it turned out libcuda.so WAS installed on my computer, albeit in an unexpected location and with a different name. I had already attempted to look for “libcuda.so” on “/” and found only the 32-bit library and stubs mentioned earlier.

Buried in another forum post for a somewhat-similar problem (https://devtalk.nvidia.com/default/topic/521395/libcuda-so-and-libamdcalcl-so-missing/ post #7) it was mentioned that libcuda.so is conventionally a symlink to “libcuda.so.major.minor” e.g., “libcuda.so.352.63” for driver version 352.63.

I searched for “libcuda.so.352.63” instead:

mcarilli:/$ find / -name libcuda.so.352.63 2>/dev/null
/usr/lib/i386-linux-gnu/libcuda.so.352.63
/usr/lib/x86_64-linux-gnu/libcuda.so.352.63
mcarilli:/$

The 64-bit version of libcuda.so.352.63 resided in /usr/lib/x86_64-linux-gnu/. In this same directory there was also a symlink to “libcuda.so.352.63” called “libcuda.so.1”.
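
Searching for every name that begins with “libcuda.so”, rather than for one exact name, would have turned up the versioned file and its symlinks in a single pass. Here is a small sketch of that idea, assuming a POSIX shell with `find`, `sort` and `readlink`; `find_lib_variants` is a hypothetical helper name, not part of any toolkit.

```shell
#!/bin/sh
# find_lib_variants (hypothetical helper): list every file or symlink under
# a root directory whose name starts with the given library name, and show
# the target of each symlink.
find_lib_variants() {
    root=$1
    name=$2
    # Permission errors (as seen earlier in this thread) are discarded.
    find "$root" -name "${name}*" 2>/dev/null | sort | while IFS= read -r path; do
        if [ -L "$path" ]; then
            printf '%s -> %s\n' "$path" "$(readlink "$path")"
        else
            printf '%s\n' "$path"
        fi
    done
}
```

Calling `find_lib_variants /usr libcuda.so` should list libcuda.so, libcuda.so.1 and libcuda.so.352.63 wherever they exist, which makes it easy to see which directory is missing the unversioned name.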

I created a new symlink to “libcuda.so.352.63” called “libcuda.so”, then added “/usr/lib/x86_64-linux-gnu/” to my LD_LIBRARY_PATH:

mcarilli:/$ echo $LD_LIBRARY_PATH 
/usr/lib/x86_64-linux-gnu/:/usr/local/cuda/lib64/:
mcarilli:/$
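
The exact commands were not posted, so what follows is only a sketch of the two steps just described, assuming a POSIX shell. `link_unversioned` and `prepend_ld_path` are hypothetical helper names.

```shell
#!/bin/sh
# link_unversioned (hypothetical helper): create a symlink such as
# libcuda.so next to a versioned library such as libcuda.so.352.63,
# unless something with that name is already there.
link_unversioned() {
    dir=$1        # directory holding the versioned library
    versioned=$2  # e.g. libcuda.so.352.63
    link=$3       # e.g. libcuda.so
    if [ ! -e "$dir/$versioned" ]; then
        echo "missing: $dir/$versioned" >&2
        return 1
    fi
    if [ -e "$dir/$link" ] || [ -L "$dir/$link" ]; then
        echo "exists: $dir/$link" >&2
        return 0
    fi
    # A relative target keeps the link valid if the directory is relocated.
    ln -s "$versioned" "$dir/$link"
}

# prepend_ld_path (hypothetical helper): put a directory at the front of
# LD_LIBRARY_PATH unless the identical string is already one of its entries.
prepend_ld_path() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*) ;;
        *) LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}
```

On this system the calls would be `link_unversioned /usr/lib/x86_64-linux-gnu libcuda.so.352.63 libcuda.so` followed by `prepend_ld_path /usr/lib/x86_64-linux-gnu/`. Writing into /usr/lib normally requires root privileges, and a driver update will probably leave the link pointing at a file that no longer exists, so pointing it at libcuda.so.1 instead may be the more durable choice.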

Adding to LD_LIBRARY_PATH may have been unnecessary; perhaps /usr/lib/x86_64-linux-gnu/ is searched by default. In any case pgaccelinfo now returns the expected output:

mcarilli:~$ pgaccelinfo

CUDA Driver Version:           7050
NVRM version:                  NVIDIA UNIX x86_64 Kernel Module  352.63  Sat Nov  7 21:25:42 PST 2015

Device Number:                 0
Device Name:                   GeForce GTX 960M
...etc...

I can now compile executables with -ta=tesla:cuda-7.5 that successfully execute on the device. I’m sure they’re executing on the device because I’m getting roughly 10x speedup over executables compiled with -ta=host, so everything is working as hoped.

Your help definitely got me on the right track (namely, conducting a thorough search for “libcuda.so”).

Much appreciated,
Michael

You’re welcome, Michael. I’m glad you figured it out. I’ll file this away as another possible cause of this issue in case another user encounters it.

  • Mat