call to cuInit returned error -1: Other

_Nils · January 11, 2016, 5:40pm

Hello,

I am trying to accelerate my code using OpenACC. The code compiles without errors, but at run time I receive the error “call to cuInit returned error -1: Other” at the first OpenACC directive.

In a similar forum topic it was mentioned that on Linux this can happen when the driver tries to use a “/dev/nvidia[0…n]” device that is not available. The suggestion there was to run pgaccelinfo as root and then switch back to the regular user.
(The forum topic I am referring to is What is "call to cuInit returned error 999: Unknown" )
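A quick way to check the device-node theory as a regular user is to list the /dev/nvidia* nodes and test whether the current user can read and write them. This is only a minimal sketch: the function name check_device_nodes is made up for illustration, and it assumes the nodes follow the usual /dev/nvidia* naming.

```python
import glob
import os


def check_device_nodes(pattern="/dev/nvidia*"):
    """Return (path, accessible) for every node matching the pattern.

    The default pattern is an assumption based on the usual naming of
    NVIDIA device nodes on Linux.
    """
    results = []
    for path in sorted(glob.glob(pattern)):
        # os.access tests read/write permission for the real user ID
        # of the calling process.
        accessible = os.access(path, os.R_OK | os.W_OK)
        results.append((path, accessible))
    return results


if __name__ == "__main__":
    nodes = check_device_nodes()
    if not nodes:
        print("No /dev/nvidia* device nodes found")
    for path, accessible in nodes:
        state = "accessible" if accessible else "NOT accessible"
        print(f"{path}: {state} for uid {os.getuid()}")
```

If the list is empty for a regular user but not for root, or the nodes show up as not accessible, that would point at missing or restricted device nodes rather than at the compiler.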

Running pgaccelinfo as root yields

CUDA Driver Version:           6050
NVRM version:                  NVIDIA UNIX x86_64 Kernel Module  340.93  Wed Aug 19 16:49:15 PDT 2015

Device Number:                 0
Device Name:                   Tesla K40c
Device Revision Number:        3.5
Global Memory Size:            12079136768
Number of Multiprocessors:     15
Number of SP Cores:            2880
Number of DP Cores:            960
Concurrent Copy and Execution: Yes
Total Constant Memory:         65536
Total Shared Memory per Block: 49152
Registers per Block:           65536
Warp Size:                     32
Maximum Threads per Block:     1024
Maximum Block Dimensions:      1024, 1024, 64
Maximum Grid Dimensions:       2147483647 x 65535 x 65535
Maximum Memory Pitch:          2147483647B
Texture Alignment:             512B
Clock Rate:                    745 MHz
Execution Timeout:             No
Integrated Device:             No
Can Map Host Memory:           Yes
Compute Mode:                  default
Concurrent Kernels:            Yes
ECC Enabled:                   Yes
Memory Clock Rate:             3004 MHz
Memory Bus Width:              384 bits
L2 Cache Size:                 1572864 bytes
Max Threads Per SMP:           2048
Async Engines:                 2
Unified Addressing:            Yes
Managed Memory:                Yes
PGI Compiler Option:           -ta=tesla:cc35

However, when I switch back to a regular user and run pgaccelinfo again, I get

NVRM version:                  NVIDIA UNIX x86_64 Kernel Module  340.93  Wed Aug 19 16:49:15 PDT 2015
No accelerators found.

and when running the code as a regular user I still receive the same error.

It turns out I can run my program as root with the accelerator, though I would like to be able to run my code as a regular user, too.

If anyone has an idea of what is happening and how I can avoid switching to root in order to run my program, that would be highly appreciated.

Thanks and regards,
Nils

PS: I am using Ubuntu 14.04 and PGI 15.7

MatColgrove · January 12, 2016, 1:25am

Hi Nils,

When I see this error it’s because the CUDA runtime driver, /usr/lib64/libcuda.so, isn’t installed. But in this case, I’m wondering if the permissions are incorrect and only root has access to it.

What are the permissions on “/usr/lib64/libcuda.so”?
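If it helps, here is a rough sketch for collecting that information. The directory list is an assumption (the library location varies by distribution and driver install), and describe_libraries is just an illustrative helper name.

```python
import glob
import os
import stat

# Assumed search locations; adjust for your distribution and driver install.
CANDIDATE_DIRS = ["/usr/lib64", "/usr/lib/x86_64-linux-gnu", "/usr/lib"]


def describe_libraries(dirs=CANDIDATE_DIRS, name="libcuda.so*"):
    """Return (path, mode string, readable) for each matching library file."""
    found = []
    for directory in dirs:
        for path in sorted(glob.glob(os.path.join(directory, name))):
            try:
                # os.stat follows symlinks, so this reports the real file.
                mode = stat.filemode(os.stat(path).st_mode)
            except OSError:
                # Broken symlink or a file we cannot stat at all.
                mode = "unavailable"
            found.append((path, mode, os.access(path, os.R_OK)))
    return found


if __name__ == "__main__":
    libs = describe_libraries()
    if not libs:
        print("libcuda.so not found in", CANDIDATE_DIRS)
    for path, mode, readable in libs:
        print(f"{mode} {path} readable={readable}")
```

Running it once as root and once as the regular user should show whether the library is missing or only unreadable for the regular user.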

Mat

_Nils · January 14, 2016, 4:02pm

Hi Mat,

Thank you very much for your quick reply!
I looked for libcuda.so but didn’t find it under the path you stated. When I asked our admin about it, he told me that three different CUDA versions are indeed installed and that he had recently been modifying some things in the version that was set in my environment variables. Changing the environment variables to point to the newest installed CUDA version did the trick for me.
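For anyone hitting the same problem, a small sketch like the following shows which CUDA installation the environment currently points to. The variable names PATH, LD_LIBRARY_PATH and CUDA_HOME are an assumption about what is commonly used (I am not listing exactly which variables were changed here), and cuda_entries is a made-up helper name.

```python
import os


def cuda_entries(environ=None,
                 variables=("PATH", "LD_LIBRARY_PATH", "CUDA_HOME")):
    """Return, per variable, the path entries that mention "cuda".

    The variable names are assumptions; which ones matter depends on
    how the system was set up.
    """
    environ = os.environ if environ is None else environ
    report = {}
    for var in variables:
        # Split on the platform path separator and drop empty entries.
        parts = [p for p in environ.get(var, "").split(os.pathsep) if p]
        report[var] = [p for p in parts if "cuda" in p.lower()]
    return report


if __name__ == "__main__":
    for var, entries in cuda_entries().items():
        print(f"{var}: {entries if entries else 'no CUDA entries'}")
```

If the entries refer to different CUDA versions, or to a version that was recently modified, that mismatch is worth fixing first.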

Thanks again for your input,

Nils