CUDA 9.0 install on Ubuntu 18.04 "unmet dependencies"

Just tried to install CUDA 9.0 (needed for TensorFlow) on a Dell 9550 laptop / GTX 960 with the latest Ubuntu 18.04 and the 390 driver (installed via Software & Updates > Additional Drivers).

sudo dpkg -i cuda-repo-ubuntu1704-9-0-local_9.0.176-1_amd64.deb
sudo dpkg -i <3patchfiles>
sudo apt update
sudo apt install cuda
“…
The following packages have unmet dependencies:
cuda : Depends: cuda-9-0 (>= 9.0.176) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.”

The same worked on a NUC6i7KYK with a Thunderbolt 3 HP Omen eGPU 1070 Ti (needs the newer 4.15 Kernel of 18.04) and a manually installed 390 driver.
That’s why I actually went for 18.04 on the Dell too, to have both machines on the same versions, but now I get this error.
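
One thing that can be checked at this point, as a rough sketch: the local repo package used above is tagged ubuntu1704, while the machine runs 18.04. The helper below only pulls the release tag out of a repo .deb filename so it can be compared with the host release. `repo_release` is a hypothetical name, and the hard-coded host value stands in for what `lsb_release -rs` would print on the real system. A mismatch alone does not prove it is the cause of the unmet dependencies.

```shell
#!/bin/sh
# Extract the Ubuntu release a CUDA repo package was built for from its
# filename, e.g. "...ubuntu1704..." -> "17.04".
# repo_release is a hypothetical helper name, not part of any NVIDIA tool.
repo_release() {
    tag=$(printf '%s\n' "$1" | sed -n 's/.*ubuntu\([0-9][0-9]\)\([0-9][0-9]\).*/\1.\2/p')
    [ -n "$tag" ] && printf '%s\n' "$tag"
}

deb="cuda-repo-ubuntu1704-9-0-local_9.0.176-1_amd64.deb"
host="18.04"   # assumption: the value `lsb_release -rs` prints on the Dell

built_for=$(repo_release "$deb")
if [ "$built_for" != "$host" ]; then
    echo "repo package targets Ubuntu $built_for, host runs $host"
fi
```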

Solution? Thanks
G.

Upon further investigation:
The CUDA files from dpkg -i end up in /var and not in /usr as seen and suggested somewhere else.
And I see the 384.81 driver referenced. Do I need to downgrade to it?
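
On the downgrade question, a hedged sketch: as far as I understand NVIDIA's compatibility notes, the driver version a CUDA toolkit references is a minimum, so a newer driver should normally be acceptable. The snippet compares two driver versions with `sort -V` (GNU coreutils). `version_ge` is a hypothetical helper name, and the two version strings are the ones mentioned in this thread.

```shell
#!/bin/sh
# Return success if version $1 is greater than or equal to version $2.
# Relies on sort -V (GNU coreutils version sort); version_ge is hypothetical.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

installed="390.48"   # driver version reported as installed in this thread
minimum="384.81"     # driver version referenced by the CUDA 9.0 packages

if version_ge "$installed" "$minimum"; then
    echo "driver $installed is at least $minimum"
else
    echo "driver $installed is older than $minimum"
fi
```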

CUDA 9.2 .deb install just killed the Ubuntu 18.04 installation.

Errors were encountered while processing:
/tmp/apt-dpkg-install-ukZmzx/099-nvidia-396_396.26-0ubuntu1_amd64.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)

390.48 is / was installed.

Setup > Additional drivers showed 396, 390, nouveau and “manually installed driver”, but did not let me change the selection.
Reboot failed.

Ubuntu 18.04 is not an officially supported platform for CUDA 9.0 or CUDA 9.2.

https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html

Yes, @txbob, thanks, I can read.

Edit1: @txbob, toned down: From decades of experience: more often than not, things work even though they are “not supported”.
Actually it turns out the supported 17.10 chokes as well, with at least two Canonical snake-pit idiosyncrasies:
#1 took me “just” an hour or two, not counting the additional installs and configs: with the eGPU off I get to the login screen. There, use the cogwheel below the password field to switch to “Ubuntu on Xorg”.
#2: with the eGPU on, boot hangs at “started gnome display manager and dealing with any system changes”. Still waiting for a solution (will start a new thread).
End edit

It's “interesting” that there is no official support for the overwhelmingly important Linux distribution MONTHS after its release.
Worse, when I wanted to send the bug report it said “It's not ours. Go bust.”

On the Dell, Ubuntu 18.04 is just a trial, but on the NUC6i7KYK with the Thunderbolt 3 HP Omen Accelerator eGPU I needed a more recent Ubuntu / Kernel for the TB3 eGPU to get recognized. I could either have compiled a new Kernel for Ubuntu 16.04 or tried 18.04.

Edit2
One should think that things get better / easier with technological advances, but when I look at the NUC BIOS it’s intimidating. You never know enough.
So regarding the Kernel supporting Thunderbolt 3: after days of reading up and weighing the merit of the posts, I learned that there is something like Ubuntu LTS Enablement or Hardware Enablement (HWE) Kernels, which would let me stay on 16.04, since 16.04.4 comes with the minimum 4.13 Kernel for TB3.
I could have avoided the callow / immature Snappy and Wayland BS, etc., as well as the step back to 17.10 (Wayland default trap), which I took because 18.04 (Xorg default) is not yet supported for CUDA.
Edit2 end

So it was the old story again: Help yourself to get help.
Had to jump through TWO hoops at once. Decided to dig into GRUB AND boot repair.

I had always wondered why some of my Linux installations didn't show a boot menu and occasionally got on my nerves by stating that timeout = 0 is no longer supported (or so). Go figure.

So as a remedy I wanted to edit /etc/default/grub, but as this was not the active system and I couldn't run update-grub on it afterwards, I had to edit /boot/grub/grub.cfg directly. As the installation was ***** up anyway it couldn't get worse, so I went for it and commented out several of the if conditions around timeout=30.
And voilà, there it was, the boot menu!
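
For comparison, a minimal sketch of how the same effect is usually configured in /etc/default/grub when the system is bootable and update-grub can be run. The values are illustrative assumptions that mirror the timeout=30 above; they are not taken from the actual file on this machine.

```shell
# /etc/default/grub fragment (the file is read as shell by grub-mkconfig)
GRUB_TIMEOUT_STYLE=menu   # show the boot menu instead of hiding it
GRUB_TIMEOUT=30           # seconds to wait before booting the default entry
```

After such a change, `sudo update-grub` regenerates /boot/grub/grub.cfg, which is the step that was not possible here because the installation was not the running system.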

Now Advanced Options > Recovery > dpkg Repair.
And wow, it revived the system with the 396.26 driver!

Finally solved with extreme persistence!

After failures I went back from Ubuntu 18.04 (chosen as it’s the new LTS and it has the most recent Kernel 4.15 - to be on the safe side for eGPU support; problems with Wayland and Snappy)…
…to 17.10.1 with CUDA 9.2 support and the minimum 4.13 Kernel quoted supporting Thunderbolt 3…
…to 16.04.4 having learned it has a HWE (HardWare Enablement) Kernel 4.13 as well - hopefully sufficient for the eGPU.

And 16.04.4 gave me the deciding hint when it showed a series of boot screens reporting graphics problems.
The menu choices didn’t help, but a look at the proposed System log showed that the eGPU had been blocked and that there was a new setting, “AllowExternalGpus” (seemingly appearing for the first time in the April 2018 396.18 driver, now at .26. Phew, cutting edge!).

With some more Googling I found this belonged in /etc/X11/xorg.conf, which actually didn’t exist; there was only a file with an additional date extension (xorg.conf.07062018). Maybe I screwed things up trying the options in the boot phase. This is the section I used:

Section "Device"
    Identifier "nvidia"
    Driver "nvidia"
    BusID "PCI:9@0:0:0"
    Option "ConstrainCursor" "off"
    Option "AllowExternalGpus" "true"
EndSection
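
A small sketch for getting the BusID value, under the assumption that the address comes from `lspci` output: lspci prints bus, device and function in hexadecimal, while xorg.conf expects decimal numbers. `lspci_to_busid` is a hypothetical helper name. It emits the plain `PCI:bus:device:function` form, which should address the same device as the `9@0` (bus@domain) notation above when the PCI domain is 0.

```shell
#!/bin/sh
# Convert an lspci-style address (hex "bus:device.function", optionally
# prefixed by a 4-digit "domain:") into the decimal BusID form for xorg.conf.
# lspci_to_busid is a hypothetical helper name.
lspci_to_busid() {
    addr=$1
    # Drop a leading PCI domain such as "0000:" if present.
    case $addr in
        ????:??:??.?) addr=${addr#*:} ;;
    esac
    bus=${addr%%:*}
    rest=${addr#*:}
    dev=${rest%%.*}
    fn=${rest#*.}
    # printf %d accepts 0x-prefixed hex and prints it as decimal.
    printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$fn"
}

lspci_to_busid "09:00.0"   # prints PCI:9:0:0
```

The conversion matters once the bus number goes past 9: an lspci address of 0a:00.0 has to be written as PCI:10:0:0.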

Edit
And I put Option "AllowExternalGpus" "true" in Section "Screen" as well, as the docs didn’t say exactly which section to use. (Too late to try tonight! Just getting this out.)

Coupla days later: Just tried on Ubuntu 17.10: AllowExternalGpus in Section “Device” is sufficient!

Also: Apart from https://github.com/intel/thunderbolt-software-user-space I came across https://gitlab.freedesktop.org/bolt/bolt which deals with TB3 security and may be worth a look.
Edit end

Hope some have fun with slim notebooks and eGPUs and nvidia-docker2 NGC and Deep Learning now.
Cheers
G.

I fixed this issue only with this:
sudo apt-get -o Dpkg::Options::="--force-overwrite" install --fix-broken