CUDA has to be started as 'root' at least once to work properly

Hello guys!

A friend and I have been trying for days to find out why this happens. Unless I run a CUDA program as root at least once on my machine, I cannot use the library's functionality. Please take a look at the console log I posted below, which shows exactly what happens when the commands are entered in this order on my machine. We did not observe this behavior on my friend's machine.

Maybe someone on this forum can help us.

My hardware: http://pastebin.com/GyLNs5B3
CPU: Intel® Core™ i7-3610QM CPU @ 2.30GHz
GPU: GeForce GTX 660M

System:

Linux archlinux 3.13.5-1-ARCH #1 SMP PREEMPT Sun Feb 23 00:25:24 CET 2014 x86_64 GNU/Linux

NVIDIA Driver:

local/bumblebee 3.2.1-3
    NVIDIA Optimus support for Linux through VirtualGL
local/cuda 5.5.22-1
    NVIDIA's GPU programming toolkit
local/lib32-nvidia-libgl 334.21-1
    NVIDIA drivers libraries symlinks (32-bit)
local/lib32-nvidia-utils 334.21-1
    NVIDIA drivers utilities (32-bit)
local/libcl 1.1-3
    OpenCL library and ICD loader from NVIDIA
local/libvdpau 0.7-1
    Nvidia VDPAU library
local/nvidia 334.21-2
    NVIDIA drivers for linux
local/nvidia-utils 334.21-1
    NVIDIA drivers utilities
local/opencl-nvidia 334.21-1
    OpenCL implemention for NVIDIA
local/pycuda-headers 2013.1.1-3
    Python wrapper for Nvidia CUDA
local/python2-pycuda 2013.1.1-3
    Python wrapper for Nvidia CUDA

Console Log:

[snooc@archlinux mergeSort]$ ./mergeSort 
./mergeSort Starting...

CUDA error at ../../common/inc/helper_cuda.h:898 code=38(cudaErrorNoDevice) "cudaGetDeviceCount(&device_count)" 
[snooc@archlinux mergeSort]$ sudo modprobe nvidia-uvm
[snooc@archlinux mergeSort]$ ./mergeSort 
./mergeSort Starting...

CUDA error at ../../common/inc/helper_cuda.h:898 code=30(cudaErrorUnknown) "cudaGetDeviceCount(&device_count)" 
[snooc@archlinux mergeSort]$ sudo modprobe nvidia-uvm 
[snooc@archlinux mergeSort]$ sudo modprobe nvidia
[snooc@archlinux mergeSort]$ ./mergeSort 
./mergeSort Starting...

CUDA error at ../../common/inc/helper_cuda.h:898 code=30(cudaErrorUnknown) "cudaGetDeviceCount(&device_count)" 
[snooc@archlinux mergeSort]$ lsmod | grep nv
nvidia_uvm             25990  0 
nvidia              10736148  1 nvidia_uvm
drm                   239102  4 i915,drm_kms_helper,nvidia
i2c_core               24760  7 drm,i915,i2c_i801,drm_kms_helper,i2c_algo_bit,nvidia,videodev
[snooc@archlinux mergeSort]$ ./mergeSort 
./mergeSort Starting...

CUDA error at ../../common/inc/helper_cuda.h:898 code=30(cudaErrorUnknown) "cudaGetDeviceCount(&device_count)" 
[snooc@archlinux mergeSort]$ sudo ./mergeSort 
./mergeSort Starting...

GPU Device 0: "GeForce GTX 660M" with compute capability 3.0

Allocating and initializing host arrays...

Allocating and initializing CUDA arrays...

Initializing GPU merge sort...
Running GPU merge sort...
Time: 103.632004 ms
Reading back GPU merge sort results...
Inspecting the results...
...inspecting keys array: OK
...inspecting keys and values array: OK
...stability property: stable!
Shutting down...
[snooc@archlinux mergeSort]$ sudo modprobe -r nvidia_uvm 
[snooc@archlinux mergeSort]$ sudo modprobe -r nvidia
[snooc@archlinux mergeSort]$ ./mergeSort 
./mergeSort Starting...

CUDA error at ../../common/inc/helper_cuda.h:898 code=38(cudaErrorNoDevice) "cudaGetDeviceCount(&device_count)" 
[snooc@archlinux mergeSort]$ sudo modprobe nvidia
[snooc@archlinux mergeSort]$ sudo modprobe nvidia-uvm 
[snooc@archlinux mergeSort]$ ./mergeSort 
./mergeSort Starting...

GPU Device 0: "GeForce GTX 660M" with compute capability 3.0

Allocating and initializing host arrays...

Allocating and initializing CUDA arrays...

Initializing GPU merge sort...
Running GPU merge sort...
Time: 103.625000 ms
Reading back GPU merge sort results...
Inspecting the results...
...inspecting keys array: OK
...inspecting keys and values array: OK
...stability property: stable!
Shutting down...
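
One thing that may be worth checking, given the log above: a non-root process can only talk to the driver if the device nodes under /dev exist and are accessible to it. The following is only a rough sketch for inspecting that, not a confirmed diagnosis. The node names (nvidiactl, nvidia0, nvidia-uvm) are an assumption for a single-GPU setup like this one, and check_nodes is a hypothetical helper, not part of CUDA or the driver.

```python
import os

# Assumption: single GPU, so the per-device node is nvidia0.
# The exact set of nodes can differ between driver versions.
EXPECTED_NODES = ("nvidiactl", "nvidia0", "nvidia-uvm")


def check_nodes(dev_dir="/dev", names=EXPECTED_NODES):
    """Return {name: status}, status being 'missing', 'no-access' or 'ok'."""
    report = {}
    for name in names:
        path = os.path.join(dev_dir, name)
        if not os.path.exists(path):
            # Node was never created (or the module is not loaded).
            report[name] = "missing"
        elif not os.access(path, os.R_OK | os.W_OK):
            # Node exists, but the current user cannot read and write it.
            report[name] = "no-access"
        else:
            report[name] = "ok"
    return report


if __name__ == "__main__":
    for name, status in check_nodes().items():
        print(f"/dev/{name}: {status}")
```

Running this as the normal user once before and once after the `sudo ./mergeSort` step would show whether the root run changes anything under /dev. If a node goes from "missing" to "ok", that would point at device node creation rather than at CUDA itself.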

PS: I had to make a second post because the forum wouldn’t allow me to post all of this in one. Don’t ask me why; the server replied “permission denied”.

I believe someone mentioned in another thread that this issue does not occur with CUDA 6.0 RC, if you wish to use the latest driver.

Yeah, the thread: https://devtalk.nvidia.com/default/topic/699610/linux/334-21-driver-returns-999-on-cuinit-cuda-/