Cannot install CUDA for my GeForce GTX 1050 Ti on Ubuntu 18.04

I have a fresh install of Ubuntu on my computer with an NVIDIA Corporation GP107 [GeForce GTX 1050 Ti] (rev a1). I just want to use my GPU to work with PyTorch, but I’m having trouble getting CUDA installed.

I began by following an online tutorial which told me that if I installed PyTorch with Anaconda, I wouldn’t need to install CUDA separately (https://youtu.be/UWlFM0R_x6I). However, this did not work: after following that advice, torch.cuda.is_available() still returns False.
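For anyone debugging the same symptom, here is a minimal sketch of a check that reports what PyTorch itself sees. The helper name `cuda_report` is made up for illustration; it relies only on `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`, and it returns early if PyTorch is not importable.

```python
def cuda_report():
    """Return a small dict describing what PyTorch sees (hypothetical helper)."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this Python environment
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version the PyTorch binary was built against; None for CPU-only builds
        "built_with_cuda": torch.version.cuda,
        # False when PyTorch cannot find a usable NVIDIA driver/GPU at runtime
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(cuda_report())
```

If `built_with_cuda` is None, the installed package is a CPU-only build. If it shows a version but `cuda_available` is False, the NVIDIA driver is the more likely place to look.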

Then I looked up further guides, and I think the problem is that I don’t have the necessary drivers for my GPU and I lack the CUDA toolkit. I think this because on my computer “nvcc --version” returns “command nvcc not found”, which indicates that the CUDA toolkit is missing. So I downloaded CUDA toolkit 10 from https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1804 and tried torch.cuda.is_available() again, but it still returns False.

I repeated the above several times for toolkits 10, 9, and 8, and none of them worked. I also tried to download NVIDIA drivers from https://www.nvidia.com/Download/index.aspx, which required that I uninstall some things. I can’t remember what they were, but it caused my computer to crash and I had to reinstall Ubuntu.

I lack any knowledge of setting up graphics cards for machines so excuse my ignorance on the subject.

For Ubuntu 18.04 you should use the CUDA 10 installer only.

You may have a messed up Linux config after all the things you’ve done. The easiest cleanup option might be to reinstall the OS.

Otherwise, follow the instructions in the Linux install guide carefully:

https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html

The whole document is important. I suggest reading and understanding the whole thing. For example, if you skip or ignore sections 2.7, or 7.x, or nearly any other section, it may not work.

After installing, if nvidia-smi doesn’t work, nothing else will work.
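That check can be scripted. The following is only a sketch: the function name `driver_version` is illustrative, and it assumes `nvidia-smi` supports the `--query-gpu=driver_version --format=csv,noheader` options, as current drivers do. It returns None whenever the tool is missing or fails.

```python
import shutil
import subprocess


def driver_version():
    """Return the driver version reported by nvidia-smi, or None if unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None  # tool not on PATH: the driver is probably not installed
    try:
        out = subprocess.run(
            [exe, "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if out.returncode != 0:
        return None  # e.g. driver files present but the kernel module is not loaded
    lines = out.stdout.strip().splitlines()
    return lines[0].strip() if lines else None


if __name__ == "__main__":
    print(driver_version() or "nvidia-smi is not working")
```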

Before trying to figure out if PyTorch will work, try to verify or validate the CUDA install first. There are instructions in the Linux install guide.

I followed the pre-installation steps. When I was installing the CUDA 10 toolkit, it said:

nvidia-installer log file '/var/log/nvidia-installer.log'
creation time: Sun Dec  9 08:36:11 2018
installer version: 410.48

PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin

nvidia-installer command line:
    ./nvidia-installer
    --ui=none
    --no-questions
    --accept-license
    --disable-nouveau
    --no-cc-version-check
    --run-nvidia-xconfig

Using built-in stream user interface
-> Detected 6 CPUs online; setting concurrency level to 6.
-> Installing NVIDIA driver version 410.48.
-> Running distribution scripts
   executing: '/usr/lib/nvidia/pre-install'...
-> done.
-> The distribution-provided pre-install script failed!  Are you sure you want to continue? (Answer: Continue installation)
ERROR: The Nouveau kernel driver is currently in use by your system.  This driver is incompatible with the NVIDIA driver, and must be disabled before proceeding.  Please consult the NVIDIA driver README and your Linux distribution's documentation for details on how to correctly disable the Nouveau kernel driver.
WARNING: One or more modprobe configuration files to disable Nouveau are already present at: /etc/modprobe.d/nvidia-installer-disable-nouveau.conf.  Please be sure you have rebooted your system since these files were written.  If you have rebooted, then Nouveau may be enabled for other reasons, such as being included in the system initial ramdisk or in your X configuration file.  Please consult the NVIDIA driver README and your Linux distribution's documentation for details on how to correctly disable the Nouveau kernel driver.
-> For some distributions, Nouveau can be disabled by adding a file in the modprobe configuration directory.  Would you like nvidia-installer to attempt to create this modprobe file for you? (Answer: Yes)
-> One or more modprobe configuration files to disable Nouveau have been written.  For some distributions, this may be sufficient to disable Nouveau; other distributions may require modification of the initial ramdisk.  Please reboot your system and attempt NVIDIA driver installation again.  Note if you later wish to reenable Nouveau, you will need to delete these files: /etc/modprobe.d/nvidia-installer-disable-nouveau.conf
ERROR: Installation has failed.  Please see the file '/var/log/nvidia-installer.log' for details.  You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.

I came across this error before, and trying to disable Nouveau crashed my dual boot in the past, so I had to reinstall Ubuntu.

I guess you haven’t followed the instructions for disabling Nouveau in the Linux install guide I linked.

You’ll need to disable Nouveau to use the runfile installer method. The instructions are given. If you don’t want to use them for some reason, then you could also try the package manager install method. That method does not require the same step of disabling Nouveau.
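Before rerunning the runfile installer, it can help to confirm that Nouveau is really gone after the reboot. This is a sketch only: `module_loaded` is a hypothetical helper that scans text in the format of `/proc/modules`, where the first field on each line is the module name.

```python
from pathlib import Path


def module_loaded(name, modules_text):
    """True if a kernel module called `name` appears in /proc/modules-style text."""
    for line in modules_text.splitlines():
        fields = line.split()
        # The first whitespace-separated field is the module name
        if fields and fields[0] == name:
            return True
    return False


if __name__ == "__main__":
    proc_modules = Path("/proc/modules")
    if proc_modules.exists():
        text = proc_modules.read_text()
        print("nouveau loaded:", module_loaded("nouveau", text))
```

If this still prints True after a reboot, the installer warning in the log above applies: Nouveau may be included in the initial ramdisk or enabled in the X configuration.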

All of this information is contained in the Linux install guide I linked.

Thank you for being patient with me. You’re absolutely right, I haven’t been following your instructions. I’ll carefully go through them now.

I tried to follow the guide and torch.cuda.is_available() is still returning False.

Firstly, I followed all the pre-installation steps. I also disabled the Nouveau display drivers and used the NVIDIA drivers from nvidia-driver-390. Then I ran the CUDA 10 local installation, installing the toolkit and the samples. Then I did the environment setup as part of post-installation. Then I installed Anaconda for Python 3.7 and did

conda install pytorch torchvision cuda100 -c pytorch

Running torch.cuda.is_available() with the above setup still returns False. From there I also tried:

sudo apt install nvidia-cuda-toolkit gcc-6

nvcc --version

as suggested in https://askubuntu.com/questions/1028830/how-do-i-install-cuda-on-ubuntu-18-04, but it still isn’t working.

Running nvcc -V returns:

nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85

Is it a problem that this is 9.1 instead of 10?
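To compare toolkit versions without reading the banner by eye, the release number can be pulled out of the nvcc output. A minimal sketch, where `nvcc_release` is an illustrative name and `SAMPLE` is the output quoted above; the 9.1 presumably comes from the apt nvidia-cuda-toolkit package rather than from the CUDA 10 installer, though the thread does not confirm which nvcc is first on PATH.

```python
import re
import shutil


def nvcc_release(nvcc_output):
    """Extract the 'release X.Y' number from `nvcc -V` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


SAMPLE = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85
"""

if __name__ == "__main__":
    print("release:", nvcc_release(SAMPLE))
    # Shows which nvcc binary the shell would pick up, or None if there is none
    print("nvcc on PATH:", shutil.which("nvcc"))
```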

Also, I now have all the NVIDIA programs like Nsight Eclipse, NVIDIA Visual Profiler, and NVIDIA X Server Settings, which I don’t know the purpose of.

What can I do now?

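CUDA 10 will not work with the 390 drivers. It needs a newer driver branch; the installer log you posted shows CUDA 10 bundling driver 410.48.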
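The version comparison can be sketched as follows. This assumes the Linux minimum of 410.48 that NVIDIA's release notes list for CUDA 10.0, which matches the driver version in the installer log earlier in the thread. The function names and the example version "390.87" are illustrative, not taken from the thread.

```python
# Assumption: minimum Linux driver for CUDA 10.0 per NVIDIA's release notes
MIN_DRIVER_FOR_CUDA_10_0 = (410, 48)


def parse_version(text):
    """Turn a string like '410.48' into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_ok_for_cuda_10_0(driver_version):
    """True if the driver version meets the assumed CUDA 10.0 minimum."""
    return parse_version(driver_version) >= MIN_DRIVER_FOR_CUDA_10_0


if __name__ == "__main__":
    for version in ("390.87", "410.48"):
        print(version, driver_ok_for_cuda_10_0(version))
```

Comparing tuples of integers avoids the trap of comparing version strings, where "410.104" would sort before "410.48".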

I have tried installing the driver as part of the package from nvidia.com/getcuda; however, it says:

Installing the NVIDIA display driver...
The driver installation has failed due to an unknown error. Please consult the driver installation log located at /var/log/nvidia-installer.log.

===========
= Summary =
===========

Driver:   Installation Failed
Toolkit:  Installation skipped
Samples:  Installation skipped


Logfile is /tmp/cuda_install_26728.log

That is why I tried to install the driver another way, using Software & Updates -> Additional Drivers. Then I can get the toolkit and samples.

I GOT IT TO WORK!!! The trick was that I needed to use the deb install instead of local. Thank you so much for your help.