Black screen in Ubuntu 18 even after purging Nvidia and installing drivers from repository

libglvnd0 is not part of the nvidia driver but an essential part of the OpenGL driver stack. So the whole desktop and graphics stack got purged. Please try just
sudo apt-get install --reinstall ubuntu-desktop


So I do that and then reboot, and everything else we've fixed should remain normal? (Xorg and TeamViewer configurations).

I have several conda environments set up and can run various software packages to analyze data fine; it's just visualizing the results that has become an ordeal, having to transfer files back and forth.

Will "sudo apt-get install --reinstall ubuntu-desktop" reset or delete all software I have installed on the machine, or does it entail any other risks? (If yes, I can try after important meetings I have on Tuesday/Wednesday; if not, I can try immediately).

Uninstalling libglvnd already removed all applications that use OpenGL along with it. Your home directory should be untouched, so everything you installed using Anaconda (if it resides in your home directory) should still be there. Installing ubuntu-desktop should not remove anything; just check the summary when apt asks you to proceed.


You can also use the --dry-run option for apt to check first what's going to happen.
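For reference, a dry run of the reinstall suggested above (a sketch, assuming a stock Ubuntu/Debian apt) would look like this; it prints the full list of packages that would be installed or removed without touching anything:

```shell
# Preview exactly what apt would do; no packages are actually changed.
# Check the "The following NEW packages will be installed" and
# "The following packages will be REMOVED" sections of the output.
sudo apt-get install --reinstall --dry-run ubuntu-desktop
```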


The conda environments are set up within my home directory, but there's one piece of software that I did install with sudo so that it's available system-wide (this one is easy to reinstall though).

TeamViewer is also installed with sudo so that it's available to all users (occasionally other people such as collaborators or visiting scholars need guest access to my machine), but I guess that's easy to reinstall too, particularly now that I know that I'll have to redo the step to set up the workstation as a "trusted" device…

Ok, I'll give --dry-run a try.

Done, TeamViewer works as normal again!
THANK YOU SO MUCH!
Glad the hellish conditions didn't go beyond 10 days.
(I wonder whether I should upgrade from Ubuntu 18 to Ubuntu 20 when I'm back in town in January…).

Also, not directly related to the upgrade of the Nvidia drivers, but you're so knowledgeable that I wonder whether you might also be able to figure out why the offending workstation (let's call it workstation "A") doesn't accept remote SSH connections directly from my MacBook (or my partner's MacBook).
I can SSH into it fine from within other workstations at my work institution, e.g., workstations "B", "C", and "D", which I can access from my MacBook without issues…

The problem seems to have started when I installed Cisco AnyConnect on my laptop, and even though I've removed it and supposedly turned off all firewalls (here and on the remote workstation "A"), SSH still times out.
[I never installed Cisco AnyConnect on my partner's Mac, so I'm not sure why I can't connect from there either…; I've tried using both laptops from multiple internet connections in different cities, to no avail].
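One quick way to narrow down where a timing-out connection dies (a sketch; `user` and the hostname here are placeholders for workstation "A"'s actual account and address) is to run ssh verbosely from the MacBook:

```shell
# -v prints each stage of the connection attempt.
# A hang before "Connection established" means the TCP handshake itself is
# timing out (network path or firewall); a hang later, during key exchange or
# authentication, points at the SSH daemon on the workstation instead.
ssh -v user@workstation-A.example.edu
```

If it never gets past the initial connect, the packets are being dropped before sshd on the workstation ever sees them.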

Hard to say anything about the network issue: IPv4/IPv6, a different subnet, a router involved? Have you checked the iptables output?

$ sudo iptables -t nat -L
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
DOCKER all -- anywhere anywhere ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT)
target prot opt source destination

Chain OUTPUT (policy ACCEPT)
target prot opt source destination
DOCKER all -- anywhere !localhost/8 ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
MASQUERADE all -- 172.17.0.0/16 anywhere

Chain DOCKER (2 references)
target prot opt source destination
RETURN all -- anywhere anywhere


The "local" stuff seems problematic, I guess? It definitely looks different from the output on a workstation that I can access directly! (That one doesn't have "local" anywhere in the output; no idea how it got changed when SSH stopped working on the offending machine.) This is what a machine that I can access directly outputs:


$ sudo iptables -t nat -L
Chain PREROUTING (policy ACCEPT)
target prot opt source destination

Chain INPUT (policy ACCEPT)
target prot opt source destination

Chain OUTPUT (policy ACCEPT)
target prot opt source destination

Chain POSTROUTING (policy ACCEPT)
target prot opt source destination


Below is what ChatGPT suggests as a possible fix to make the first configuration (on the offending workstation) match the second (the workstation I can access directly), but I'm not sure whether to trust it (afraid to make things worse). Is it safe to run?


# Clear Docker-related rules in the PREROUTING chain
sudo iptables -t nat -D PREROUTING -j DOCKER

# Clear Docker-related rules in the OUTPUT chain
sudo iptables -t nat -D OUTPUT -m addrtype --dst-type LOCAL ! -d 127.0.0.0/8 -j DOCKER

# Clear Docker-related rules in the POSTROUTING chain
sudo iptables -t nat -D POSTROUTING -s 172.17.0.0/16 -j MASQUERADE

# Remove the DOCKER chain
sudo iptables -t nat -F DOCKER
sudo iptables -t nat -X DOCKER

All those rules are the standard Docker networking rules; they shouldn't have any influence. Don't delete them, otherwise some Docker containers running on that system might fail.

Seeing only the nat table isn't enough to judge what might be denying network traffic; the contents of the filter table would be needed for that:
iptables -S
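A quick way to scan that output for anything restrictive (a sketch to run on workstation "A"; the grep pattern is my addition, not from the thread):

```shell
# List every filter-table rule and keep only those that drop or reject packets;
# an empty result means no filter rule on this host is blocking SSH.
sudo iptables -S | grep -E 'DROP|REJECT'
```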

Yeah, I tried Docker years ago… but that workstation is 99% for my personal use; guests don't have sudo privileges on it, so if I'm not using Docker, nobody is.
It doesn't seem like clearing it would help though, per your response. Thank you for the feedback! Anything else I should check?

Hi Mart, here's the output of iptables -S:

$ sudo iptables -S
-P INPUT ACCEPT
-P FORWARD DROP
-P OUTPUT ACCEPT
-N DOCKER
-N DOCKER-ISOLATION-STAGE-1
-N DOCKER-ISOLATION-STAGE-2
-N DOCKER-USER
-A FORWARD -j DOCKER-USER
-A FORWARD -j DOCKER-ISOLATION-STAGE-1
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN
-A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
-A DOCKER-ISOLATION-STAGE-2 -j RETURN
-A DOCKER-USER -j RETURN

Nothing there preventing incoming or outgoing traffic. So it's not that.

You could use traceroute to check if the expected routes are taken.

Yeah, I think I did that a few months ago when the problem first started; no issues that I could tell (but I have no experience with this, so maybe I missed/misinterpreted something). Here's the output on running it again just now:

$ traceroute 134.79.26.28
traceroute to 134.79.26.28 (134.79.26.28), 64 hops max, 52 byte packets
1 192.168.4.1 (192.168.4.1) 12.411 ms 6.386 ms 6.339 ms
2 10.37.0.1 (10.37.0.1) 13.319 ms 14.746 ms 14.116 ms
3 100.127.77.36 (100.127.77.36) 19.738 ms 17.730 ms 26.454 ms
4 100.120.100.36 (100.120.100.36) 24.580 ms 17.877 ms 16.897 ms
5 * * *
6 * * *
7 * * e0-78.core3.sjc2.he.net (66.220.12.165) 37.355 ms
8 * * *
9 lawrence-berkeley-national-laboratory.e0-1.switch1.sjc2.he.net (66.220.6.250) 37.411 ms 38.708 ms 36.653 ms
10 sunn-cr6--slac50s-bb-c.igp.es.net (134.55.57.145) 38.703 ms 41.749 ms 37.845 ms
11 slac50s-cr6--slac50n-bb-a.igp.es.net (134.55.56.38) 37.977 ms 35.125 ms 40.040 ms
12 slac50n-cr6--slac50s-bb-c.igp.es.net (134.55.57.143) 41.604 ms 40.560 ms
slac50n-cr6--slac50s-bb-b.igp.es.net (134.55.57.141) 38.736 ms
13 site--slac-se58.ip (198.129.33.22) 39.068 ms 37.490 ms 41.500 ms
14 * * *
15 * * *
16 * * *
17 * * *
18 * * *
19 * * *
20 cryoem-jgalaz.slac.stanford.edu (134.79.26.28) 38.779 ms 36.082 ms 38.801 ms
