Hello, I am attempting to install the CUDA toolkit on an Ubuntu 14.04 server with multiple GPUs. I have reached step 4.4 in the Installation Guide ("Device Node Verification") but have become stuck. When I check /dev/nvidia* there are no entries for the needed device files, and I have been unable to generate them manually by following the steps outlined in that section.
What I have tried so far:
I placed a file named .nvidia-device-files-gen.sh in my home directory on the machine; it contains the script provided in section 4.4 of the Install Guide (link: Installation Guide Linux :: CUDA Toolkit Documentation) and is reproduced below after the job definition. I then created an Upstart job at /etc/init/nvidia-device-files-gen.conf containing the following lines:
description "multi-line explanation of what I'm doing…"
start on startup
task
exec /home/tom/.nvidia-device-files-gen.sh
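For reference, here is what I put inside .nvidia-device-files-gen.sh, essentially as given in section 4.4 of the guide (the character-device major number 195 is the one the guide uses for the NVIDIA driver):

#!/bin/bash

/sbin/modprobe nvidia

if [ "$?" -eq 0 ]; then
  # Count the NVIDIA controllers found on the PCI bus.
  NVDEVS=`lspci | grep -i NVIDIA`
  N3D=`echo "$NVDEVS" | grep "3D controller" | wc -l`
  NVGA=`echo "$NVDEVS" | grep "VGA compatible controller" | wc -l`

  # Create one device node per GPU, plus the control node.
  N=`expr $N3D + $NVGA - 1`
  for i in `seq 0 $N`; do
    mknod -m 666 /dev/nvidia$i c 195 $i
  done

  mknod -m 666 /dev/nvidiactl c 195 255
else
  exit 1
fi

/sbin/modprobe nvidia-uvm

if [ "$?" -eq 0 ]; then
  # Find the major device number assigned to the nvidia-uvm driver.
  D=`grep nvidia-uvm /proc/devices | awk '{print $1}'`

  mknod -m 666 /dev/nvidia-uvm c $D 0
else
  exit 1
fi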
However, upon rebooting the system there are still no device file entries for my GPUs in /dev/, and when I attempt to run the script manually I get the following:
user@cuda:/dev$ source ~/.nvidia-device-files-gen.sh
mknod: ‘/dev/nvidia0’: Permission denied
mknod: ‘/dev/nvidia1’: Permission denied
mknod: ‘/dev/nvidia2’: Permission denied
mknod: ‘/dev/nvidia3’: Permission denied
mknod: ‘/dev/nvidiactl’: Permission denied
modprobe: ERROR: could not insert ‘nvidia_uvm’: Operation not permitted
At this point I am pretty much at a loss. I am new to Ubuntu and CUDA, so please forgive me if I'm doing something silly, but I don't want to proceed with the install process in the guide without seeing these device files show up. Any help anyone can offer will be greatly appreciated.
That script would have to be run as root; that is why you are getting the permission errors.
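For example, instead of sourcing it into your (unprivileged) user shell, run it with root privileges, using the path from your Upstart job:

sudo bash /home/tom/.nvidia-device-files-gen.sh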
Before you get to the point of trying to put this script in place, you might want to just verify that the GPUs are accessible and working correctly. Have you done that? It might be as simple as running the deviceQuery sample code as root, or even just running nvidia-smi as root.
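For example, assuming the runfile installer put the samples in the default location in your home directory (adjust the version in the path to match your toolkit):

cd ~/NVIDIA_CUDA-7.5_Samples/1_Utilities/deviceQuery
make
sudo ./deviceQuery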
Thank you for the reply. I can see that you have the correct idea because when I run nvidia-smi as root I get the following output:
user@cuda:~$ sudo nvidia-smi
[sudo] password for user:
Thu Jun 30 14:54:07 2016
+------------------------------------------------------+
| NVIDIA-SMI 352.39     Driver Version: 352.39         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla C2050         Off  | 0000:01:00.0     Off |                    0 |
| 30%   40C    P0    N/A /  N/A |      6MiB /  2687MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GT 430      Off  | 0000:02:00.0     N/A |                  N/A |
| 52%   38C    P0    N/A /  N/A |      3MiB /  1022MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 470     Off  | 0000:05:00.0     N/A |                  N/A |
| 40%   34C    P0    N/A /  N/A |      4MiB /  1279MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 470     Off  | 0000:06:00.0     N/A |                  N/A |
| 40%   31C    P0    N/A /  N/A |      4MiB /  1279MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    1                  Not Supported                                         |
|    2                  Not Supported                                         |
|    3                  Not Supported                                         |
+-----------------------------------------------------------------------------+
I take this output to mean that the GPUs are not accessible / working correctly. I am fairly sure that I followed the install instructions to the letter up to this section, but I suppose I must have made a mistake somewhere. I installed via the runfile installer. Off the top of your head, does it seem like I have done anything obviously wrong? Thanks again.
I have resolved the issue. It turned out to be as simple as my not being familiar with the system. In case it helps anyone in the future who runs into what I described in my original post: all I needed to do was generate an xorg.conf file, and then the device files for each of my graphics cards appeared in /dev/. Nothing exotic was needed; I just didn't have X installed at all, since the machine is running Ubuntu 14.04 server edition.
See the top answer to this post for details:
http://askubuntu.com/questions/4662/where-is-the-x-org-config-file-how-do-i-configure-x-there
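For anyone else landing here: since the NVIDIA driver is already installed at this point, one way to generate the file is the nvidia-xconfig utility that ships with the driver; with multiple cards, the --enable-all-gpus option writes a Device section for each one (reboot afterwards so the device files get created):

sudo nvidia-xconfig --enable-all-gpus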
It’s true that configuring xorg.conf to use those GPUs will instantiate the device files, but it’s not the only way to do it. And there was nothing wrong with your nvidia-smi output.
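For instance, running a tool that opens the driver as root, such as nvidia-smi, will typically create the /dev/nvidia0..N and /dev/nvidiactl nodes on its own, which is essentially what the boot-time script from section 4.4 automates:

sudo nvidia-smi
ls -l /dev/nvidia*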
Thank you for the information. I've looked into the nvidia-smi output more and now I understand what you mean. I was originally focusing on the "Not Supported" messages at the bottom, but they aren't what I initially thought: they just mean that per-process memory accounting isn't supported on those GPUs, not that the GPUs themselves are broken or unsupported.