Hello PGI Gurus,
My laptop (Ubuntu 14.04, GeForce GTX 960M) currently has the CUDA 7.5 toolkit with driver version 352.63. CUDA C and C++ appear to be working properly.
I recently installed the OpenACC Toolkit to experiment with accelerating Fortran code and with OpenACC.
pgfortran generates CUDA Fortran executables from .cuf files that appear to work fine. I have verified they are executing on the GPU by running them in the visual profiler.
pgfortran also generates OpenACC executables that work when the host is the target, i.e.
mcarilli:test$
mcarilli:test$ pgfortran -acc -ta=host -Minfo=accel -o task3 task3.f90
mcarilli:test$
generates a functioning host executable (task3).
However, when the GPU is the target (-ta=nvidia), pgfortran generates OpenACC executables that fail at run time. An example of such a failure:
mcarilli:test$ pgfortran -acc -ta=nvidia -Minfo=accel -o task3 task3.f90
main:
22, Generating copy(a(:,:))
Generating create(anew(:,:))
25, Generating present(a(:,:),anew(:,:))
26, Loop is parallelizable
27, Loop is parallelizable
Accelerator kernel generated
Generating Tesla code
26, !$acc loop gang ! blockidx%y
27, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
30, Max reduction generated for error
35, Generating present(a(:,:),anew(:,:))
36, Loop is parallelizable
37, Loop is parallelizable
Accelerator kernel generated
Generating Tesla code
36, !$acc loop gang ! blockidx%y
37, !$acc loop gang, vector(128) ! blockidx%x threadidx%x
mcarilli:test$ ./task3
Jacobi relaxation Calculation: 1024 x 1024 mesh
call to cuInit returned error -1: Other
Possibly relevant: pgaccelinfo reports that no accelerators were found, even though its verbose output lists the GPU as an available OpenCL device:
mcarilli:test$ pgaccelinfo
NVRM version: NVIDIA UNIX x86_64 Kernel Module 352.63 Sat Nov 7 21:25:42 PST 2015
No accelerators found.
Try pgaccelinfo -v for more information
mcarilli:test$ pgaccelinfo -v
NVRM version: NVIDIA UNIX x86_64 Kernel Module 352.63 Sat Nov 7 21:25:42 PST 2015
could not initialize CUDA runtime, error code=-1
OpenCL Platform: NVIDIA CUDA
OpenCL Vendor: NVIDIA Corporation
Device Number: 0
Device Name: GeForce GTX 960M
Available: Yes
Compiler Available: Yes
Device Version: OpenCL 1.2 CUDA
Global Memory Size: 2147352576
Maximum Object Size: 536838144
Global Cache Size: 81920
Max Clock (MHz): 1176
Compute Units: 5
Constant Memory Size: 65536
Local Memory Size: 49152
Workgroup Size: 1024
Address Bits: 64
ECC Support: No
libcoi_host.so not found
My LD_LIBRARY_PATH is:
mcarilli:test$
mcarilli:test$ echo $LD_LIBRARY_PATH
/usr/local/cuda/lib64/:/usr/local/cuda/lib64/stubs/:
mcarilli:test$
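In case it helps anyone reproduce this: the dynamic loader searches the LD_LIBRARY_PATH entries in order, so it may be worth seeing which of them contain a libcuda.so. The following is only a minimal diagnostic sketch in bash, not part of the session above. The function name list_libcuda_dirs is hypothetical, and it makes no claim about which library the OpenACC runtime actually loads.

```shell
#!/usr/bin/env bash
# list_libcuda_dirs PATHLIST
# Hypothetical helper: print, in search order, every directory in a
# colon-separated list that contains libcuda.so or libcuda.so.<version>.
list_libcuda_dirs() {
    local pathlist="$1"
    local dirs dir f
    # Split on ':' without triggering filename expansion on the entries.
    IFS=':' read -r -a dirs <<< "$pathlist"
    for dir in "${dirs[@]}"; do
        # Skip empty entries such as the one left by a trailing ':'.
        [ -n "$dir" ] || continue
        for f in "$dir"/libcuda.so "$dir"/libcuda.so.*; do
            if [ -e "$f" ]; then
                printf '%s\n' "$dir"
                break
            fi
        done
    done
}

# Inspect the current environment (prints nothing if no entry matches).
list_libcuda_dirs "${LD_LIBRARY_PATH:-}"
```

Any directory it prints is a place where the loader could pick up a libcuda.so ahead of the system default locations.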
Any ideas why this is happening? Why would pgfortran succeed for CUDA Fortran code but produce OpenACC executables that fail to run on the device? And why would the CUDA Fortran executables work at all if pgaccelinfo reports “No accelerators found”?
Please let me know if you require any additional information to diagnose the issue.
Much appreciated,
Michael