CUDA Installation Issues, Ubuntu 16.04, GeForce GTX 860M

Hi All,

I’ve been really banging my head trying to install CUDA on my machine. As far as I can tell, I’ve correctly followed the instructions for the package-installer-based installation (as described in http://developer.download.nvidia.com/compute/cuda/8.0/secure/prod/docs/sidebar/CUDA_Installation_Guide_Linux.pdf?autho=1485047016_a6443b99e51f060c2816c2c9607f4430&file=CUDA_Installation_Guide_Linux.pdf )

However, when I try to build the samples (e.g. NVIDIA_CUDA-8.0_Samples), I get several errors/warnings:

(from standard error)

nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
/usr/bin/ld: warning: libGLX.so.0, needed by /usr/lib/nvidia-367/libGL.so, not found (try using -rpath or -rpath-link)
/usr/bin/ld: warning: libGLdispatch.so.0, needed by /usr/lib/nvidia-367/libGL.so, not found (try using -rpath or -rpath-link)
/usr/lib/nvidia-367/libGL.so: undefined reference to `__glDispatchRegisterStubCallbacks'
/usr/lib/nvidia-367/libGL.so: undefined reference to `__glXGLLoadGLXFunction'
/usr/lib/nvidia-367/libGL.so: undefined reference to `_glapi_Current'
/usr/lib/nvidia-367/libGL.so: undefined reference to `__glDispatchFini'
/usr/lib/nvidia-367/libGL.so: undefined reference to `_glapi_get_current'
/usr/lib/nvidia-367/libGL.so: undefined reference to `__glDispatchUnregisterStubCallbacks'
/usr/lib/nvidia-367/libGL.so: undefined reference to `__glDispatchInit'
collect2: error: ld returned 1 exit status
make[1]: *** [simpleTexture3D] Error 1
make: *** [2_Graphics/simpleTexture3D/Makefile.ph_build] Error 2

(also this error message in standard out)

make[1]: Entering directory '/home/daniel/Src/cudann/NVIDIA_CUDA-8.0_Samples/2_Graphics/simpleTexture3D'
/usr/local/cuda-8.0/bin/nvcc -ccbin g++   -m64      -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_60,code=compute_60 -o simpleTexture3D simpleTexture3D.o simpleTexture3D_kernel.o  -L/usr/lib/"nvidia-367" -lGL -lGLU -lX11 -lglut
Makefile:270: recipe for target 'simpleTexture3D' failed

When I try to run deviceQuery anyways I get the following error:

./bin/x86_64/linux/release/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 35
-> CUDA driver version is insufficient for CUDA runtime version
Result = FAIL

Any help would be much appreciated!

Your GPU driver is not correctly installed. This is giving rise to the deviceQuery error. The linking error with libGL/libGLX may be due to this, or it may be due to not having the correct libraries installed to compile and run the sample codes that depend on CUDA/GL interop (which simpleTexture3D does).
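
A quick way to see whether the driver's kernel module is actually loaded is to look at /proc/driver/nvidia/version, which the driver exposes while it is running. The following is only a minimal sketch: the helper name driver_version and its optional path argument are made up for illustration and are not part of any NVIDIA tool.

```shell
#!/usr/bin/env bash
# Sketch: report whether the NVIDIA kernel driver appears to be loaded.
# driver_version is a hypothetical helper; the optional argument exists
# only so the check can be pointed at a different file for testing.

driver_version() {
    local version_file="${1:-/proc/driver/nvidia/version}"
    if [ -r "$version_file" ]; then
        # The first line names the kernel module and its version.
        head -n 1 "$version_file"
    else
        echo "NVIDIA driver not loaded (no $version_file)"
        return 1
    fi
}

# Harmless on machines without the driver: prints the "not loaded" message.
driver_version || true
```

If this prints the "not loaded" message, the CUDA runtime has no driver to talk to, which is consistent with the "driver version is insufficient" error from deviceQuery.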

Properly installing a GPU driver requires careful attention to the provided instructions, and also depends on the history of the machine. If you’ve previously had GPU drivers or CUDA installed on your machine, you must follow the relevant sections in the install guide to clean up previous installs.

I don’t know the history of your machine and there’s not enough info in your question to guess what may be wrong or what you may have done wrong.

Thanks for the help! I figured it out :)

Posted my notes for posterity. Like you said, it was really a matter of carefully following
the instructions in the guide + making sure my machine had a clean record

  1. Did a clean install of ubuntu 16.04

  2. Disabled secure boot UEFI

    • I think this should either help the process or at least be neutral
  3. Downloaded the cuda toolkit 8.0

using
http://developer.download.nvidia.com/compute/cuda/8.0/secure/prod/docs/sidebar/CUDA_Installation_Guide_Linux.pdf?autho=1485306739_871a36eedd1d1d41c6cae1bacc73cb0a&file=CUDA_Installation_Guide_Linux.pdf

as a guide, will refer to specific steps.

install:

i) installed the headers
sudo apt-get install linux-headers-$(uname -r)
#note this did nothing, headers were already up to date

when I do
$ ubuntu-drivers list
I see:
intel-microcode
nvidia-367
nvidia-340

However, running
$ sudo apt-get purge nvidia-current
$ sudo apt-get purge nvidia-367
both do nothing
(says they aren’t installed so nothing to do)

Running
$ sudo nvidia-uninstall
also returns nothing (command not found)

Finally also tried the (deprecated)
./cuda_8.0.44_linux.run --uninstall -silent
which didn’t seem to do anything

At this point I gave up, so perhaps there are additional ways
of removing these drivers that I can’t find
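
For anyone repeating this step: as far as I understand, ubuntu-drivers list shows driver packages that apply to the hardware, not necessarily ones that are installed, which would explain why the purge commands had nothing to do. A minimal sketch for listing what dpkg actually considers installed is below; the helper name installed_nvidia_packages is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: list installed packages whose name mentions nvidia or cuda.
# installed_nvidia_packages is a hypothetical helper that filters
# `dpkg -l` style text arriving on stdin.

installed_nvidia_packages() {
    # Column 1 is the package state ("ii" means installed),
    # column 2 is the package name.
    awk '$1 == "ii" && $2 ~ /nvidia|cuda/ { print $2 }'
}

# Only query dpkg where it exists; empty output means nothing is installed.
if command -v dpkg >/dev/null 2>&1; then
    dpkg -l | installed_nvidia_packages
fi
```

Empty output here would match what apt-get purge reported, i.e. there is no packaged driver left to remove.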

ii) disable nouveau

  • created /etc/modprobe.d/blacklist-nouveau.conf as per 4.3.5 (in the linux guide)
    $ sudo update-initramfs -u
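
For reference, the blacklist file from the guide's Ubuntu instructions is just two lines. The sketch below writes them to a path of your choosing; the helper name write_nouveau_blacklist is made up for illustration, and the example only writes a temporary preview file so it can be run without root.

```shell
#!/usr/bin/env bash
# Sketch: generate the two-line nouveau blacklist file.
# write_nouveau_blacklist is a hypothetical helper; $1 is the output path.

write_nouveau_blacklist() {
    local target="$1"
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$target"
}

# Preview the contents in a temporary file instead of touching /etc.
preview="$(mktemp)"
write_nouveau_blacklist "$preview"
cat "$preview"
rm -f "$preview"
```

To apply it for real, the same content goes to /etc/modprobe.d/blacklist-nouveau.conf (as root), followed by the update-initramfs command above.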

iii) boot to text mode
Following http://ubuntuhandbook.org/index.php/2014/01/boot-into-text-console-ubuntu-linux-14-04/
Modified grub to boot to text
-restarted computer
-noticed that things booted into graphic mode anyway :/

-ctrl+alt+f1

$ lsmod | grep nouveau
returns nothing, so assuming nouveau is off
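
A slightly stricter version of that check, as a sketch: a plain grep also matches lines where nouveau only appears in another module's "Used by" column, so this compares the module name column exactly. The helper name module_loaded is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: test whether a kernel module is listed by lsmod.
# module_loaded is a hypothetical helper; $1 is the module name and
# `lsmod` style text is read from stdin.

module_loaded() {
    # Match only the first column (the module's own name).
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

if command -v lsmod >/dev/null 2>&1; then
    if lsmod | module_loaded nouveau; then
        echo "nouveau is still loaded"
    else
        echo "nouveau is not loaded"
    fi
fi
```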

#to be extra sure I also ran
$ sudo init 3
$ sudo service lightdm stop

iv) install
./cuda_8.0.44_linux.run --no-opengl-libs --verbose

v) post-install
*removed the grub startup parameters, as mentioned above

*Noted that there was no /dev/nvidia dir :(

so I ran the script specified in 4.4. (linux install guide) now I see
/dev/{nvidia0, nvidiactl, nvidia-uvm} all exist :)

To be certain it’s OK I ran
sudo chmod 0666 /dev/nvidia* (as per the guide’s instructions)
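
A small sketch for re-checking the device nodes after a reboot, using the three names listed above. The helper name missing_nvidia_nodes is made up for illustration, and it only tests that the paths exist, not that they are valid character devices.

```shell
#!/usr/bin/env bash
# Sketch: report which of the expected NVIDIA device nodes are absent.
# missing_nvidia_nodes is a hypothetical helper; $1 is the directory to
# inspect (normally /dev).

missing_nvidia_nodes() {
    local dev_dir="${1:-/dev}" missing=0 node
    for node in nvidia0 nvidiactl nvidia-uvm; do
        if [ ! -e "$dev_dir/$node" ]; then
            echo "missing: $dev_dir/$node"
            missing=1
        fi
    done
    # Non-zero status if at least one node was not found.
    return "$missing"
}

missing_nvidia_nodes /dev || true
```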

*added to path (in .bashrc)
export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
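
The ${VAR:+:${VAR}} part of those exports is standard shell parameter expansion: it adds ":" plus the old value only when the variable is already set and non-empty, so an unset LD_LIBRARY_PATH does not end up with a trailing colon (an empty entry there means the current directory). A minimal sketch of the behaviour, using a made-up helper name prepend_dir:

```shell
#!/usr/bin/env bash
# Sketch: show how ${current:+:${current}} behaves for empty and
# non-empty values. prepend_dir is a hypothetical helper.

prepend_dir() {
    # $1: directory to put first, $2: current value of the path variable.
    local new_dir="$1" current="$2"
    printf '%s\n' "${new_dir}${current:+:${current}}"
}

prepend_dir /usr/local/cuda-8.0/bin ""
# prints /usr/local/cuda-8.0/bin
prepend_dir /usr/local/cuda-8.0/bin /usr/bin:/bin
# prints /usr/local/cuda-8.0/bin:/usr/bin:/bin
```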

*did another reboot just to be sure

IT WORKS :) :) :) :)
(confirmed via running deviceQuery && bandwidthTest)
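
For scripting that confirmation, a sketch that looks for the "Result = PASS" line the samples print at the end (the failing run earlier printed "Result = FAIL" in the same place). The helper name sample_passed is made up for illustration, and the binary path is the one from the deviceQuery run above, relative to the samples directory.

```shell
#!/usr/bin/env bash
# Sketch: decide from a sample's output whether it passed.
# sample_passed is a hypothetical helper that reads the output on stdin.

sample_passed() {
    # Succeeds only if some line starts with "Result = PASS".
    grep '^Result = PASS' >/dev/null
}

# Path assumed from the earlier run; skipped if the binary is not there.
dq=./bin/x86_64/linux/release/deviceQuery
if [ -x "$dq" ]; then
    if "$dq" | sample_passed; then
        echo "deviceQuery: PASS"
    else
        echo "deviceQuery: FAIL"
    fi
fi
```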

noticed some graphical issues still, so removed the nouveau blacklist
sudo rm /etc/modprobe.d/blacklist-nouveau.conf
sudo update-initramfs -u

Ah, now I have CUDA installed successfully, but my display is all off :(
I was messing with nvidia-xconfig, which I think caused the issue,
so removed /etc/X11/xorg.conf
$ sudo rm /etc/X11/xorg.conf

restart
everything now works!