We have a research cluster and just added a new compute node with a GPU so we can start providing GPU capabilities to our users. The server is an IBM iDataPlex running Red Hat Enterprise Linux 5.5 (64-bit). When I run lspci it shows:
[root@node46 ~]# lspci | grep -i nvidia
19:00.0 3D controller: nVidia Corporation GF100 [Tesla S2050] (rev a3)
19:00.1 Audio device: nVidia Corporation GF100 High Definition Audio Controller (rev a1)
1a:00.0 3D controller: nVidia Corporation GF100 [Tesla S2050] (rev a3)
1a:00.1 Audio device: nVidia Corporation GF100 High Definition Audio Controller (rev a1)
So I downloaded and installed NVIDIA-Linux-x86_64-270.41.34.run, cudatoolkit_4.0.17_linux_64_rhel5.5.run, and gpucomputingsdk_4.0.17_linux.run. After installing each of those and adding the appropriate paths to my PATH and LD_LIBRARY_PATH (shown below), I rebooted the server to make sure everything was kosher. lsmod shows that the nvidia driver is loaded:
[root@node46 ~]# lsmod | grep -i nvidia
nvidia 10765936 0
i2c_core 57537 3 i2c_ec,i2c_i801,nvidia
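For reference, these are roughly the environment additions I made, assuming the default CUDA toolkit install location under /usr/local/cuda (adjust if yours differs):

# added to the users' shell environment (e.g. /etc/profile.d/cuda.sh or ~/.bashrc)
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/cuda/lib:$LD_LIBRARY_PATH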
At this point I went into the NVIDIA_GPU_Computing_SDK/C directory and did a “make x86_64=1” to build all the samples. I then went into NVIDIA_GPU_Computing_SDK/C/bin/linux/release and tried to run matrixMul, as the SDK documentation suggested. However, when I try running it from my non-privileged account I get:
[brucep@node46:~/release] ./matrixMul
[matrixMul] starting...
[ matrixMul ]
./matrixMul Starting (CUDA and CUBLAS tests)...
matrixMul.cu(83) : cudaSafeCall() Runtime API error 38: no CUDA-capable device is detected.
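For completeness, these are the build-and-run steps I used (a sketch; the paths assume the SDK's default install location in my home directory):

# build all SDK samples for 64-bit
cd ~/NVIDIA_GPU_Computing_SDK/C
make x86_64=1

# then run the matrixMul sample
cd bin/linux/release
./matrixMul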
On a whim I decided to try the same thing as root and it appears to have run fine:
[root@node46 release]# ./matrixMul
[matrixMul] starting...
[ matrixMul ]
./matrixMul Starting (CUDA and CUBLAS tests)...
Device 0: "Tesla M2050" with Compute 2.0 capability
Using Matrix Sizes: A(640 x 960), B(640 x 640), C(640 x 960)
Runing Kernels...
> CUBLAS Throughput = 426.0426 GFlop/s, Time = 0.00185 s, Size = 786432000 Ops
> CUDA matrixMul Throughput = 187.0409 GFlop/s, Time = 0.00420 s, Size = 786432000 Ops, NumDevsUsed = 1, Workgroup = 1024
Comparing GPU results with Host computation...
Comparing CUBLAS & Host results
CUBLAS compares OK
Comparing CUDA matrixMul & Host results
CUDA matrixMul compares OK
[matrixMul] test results...
PASSED
Press ENTER to exit...
After I’ve successfully run it as root, the non-privileged account is able to run the app as well:
[brucep@node46:~/release] ./matrixMul
[matrixMul] starting...
[ matrixMul ]
./matrixMul Starting (CUDA and CUBLAS tests)...
Device 0: "Tesla M2050" with Compute 2.0 capability
Using Matrix Sizes: A(640 x 960), B(640 x 640), C(640 x 960)
...
I’ve verified that this is easily reproducible: if I reboot the server, non-privileged accounts get an error that no CUDA-capable device is detected until the root account runs a CUDA app. After that, non-privileged users are able to run CUDA apps without any problems.
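To spell out the reproduction sequence (a sketch, run from a root shell on the node; results abbreviated in the comments):

reboot                                  # 1. reboot the node
su - brucep -c '~/release/matrixMul'    # 2. as a non-privileged user: "error 38: no CUDA-capable device is detected"
~brucep/release/matrixMul               # 3. same binary as root: runs fine, PASSED
su - brucep -c '~/release/matrixMul'    # 4. as the non-privileged user again: now PASSED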
What’s the reason for this? How can I get around it? I don’t want to have to set up something to run a CUDA job as root just to let other people run their own jobs…
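(For the record, the sort of kludge I'm trying to avoid would be something like the following in /etc/rc.local, pointing at any CUDA binary root can run, e.g. the SDK's deviceQuery sample if it happens to be built under /root:)

# workaround I'd rather not rely on: run a throwaway CUDA app as root at boot
/root/NVIDIA_GPU_Computing_SDK/C/bin/linux/release/deviceQuery < /dev/null > /dev/null 2>&1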
-Bruce