I have Ubuntu 20.04 installed on an old WD Passport external hard drive of mine. The actual machine I’m attempting to run it on is an MSI Pulse 15 B13V laptop, which has an NVidia RTX 4070 and some unknown Intel VGA controller. I am attempting to install NVidia drivers so I can do some CUDA development work.
Attempting to install 550 and 545 provided by the graphics-drivers PPA from the Ubuntu Software & Updates application was not successful. 550 caused a kernel panic (something about a null pointer dereference) that prevented booting even into recovery mode. Thankfully I could still chroot into it from install media and uninstall the faulty driver. As for 545, some form of build error prevented it from installing completely. This leaves me with 535.
Thankfully 535 manages to install without causing any sort of massive kernel issues. However, it does prevent me from using my GUI which is going to be a problem. Running the nvidia-smi command does return information about my GPU, so it would appear there’s no issue with the driver (though that might be an erroneous assumption on my part). Looking through the logs, there appears to be some sort of problem with Xserver. More specifically, I’m receiving an error “Cannot run in framebuffer mode. Please specify busIDs for all framebuffer devices.” I’ve been doing some searching and trying many things but cannot seem to resolve this problem to have both the NVidia driver and GUI running.
So my major question is: how can I get some version of the NVidia drivers running while still having access to a GUI? Attached are the bug report and Xorg log while driver 535 was installed.
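A note on the "Please specify busIDs" error quoted above: Xorg identifies a GPU in xorg.conf with a BusID string such as "PCI:1:0:0", written in decimal, while lspci prints PCI addresses in hexadecimal (for example 01:00.0). Below is a minimal Python sketch of that conversion. The function name lspci_to_xorg_busid is hypothetical, the addresses in the example are placeholders rather than values from this laptop, and the sketch assumes PCI domain 0000.

```python
import re


def lspci_to_xorg_busid(address: str) -> str:
    """Convert an lspci-style address ('01:00.0') to an Xorg BusID string.

    Hypothetical helper for illustration. lspci prints bus, device and
    function in hexadecimal; the BusID option in xorg.conf expects decimal.
    An optional leading domain ('0000:') is accepted but dropped, so this
    assumes the device sits in PCI domain 0.
    """
    pattern = r"(?:[0-9a-fA-F]{4}:)?([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])"
    match = re.fullmatch(pattern, address.strip())
    if match is None:
        raise ValueError(f"not an lspci-style PCI address: {address!r}")
    bus, device, function = (int(part, 16) for part in match.groups())
    return f"PCI:{bus}:{device}:{function}"


if __name__ == "__main__":
    # Placeholder addresses; run `lspci` to find the real ones.
    for addr in ("00:02.0", "01:00.0"):
        print(addr, "->", lspci_to_xorg_busid(addr))
```

The resulting string would go in the BusID line of the matching Device section in xorg.conf. Whether setting it is enough to clear the error on this particular machine is not something the logs here confirm.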
Well, I believe I may have solved at least part of the issue. It turns out that kernel version I mentioned at the end there was very much relevant. I updated my kernel version to 6.9.9-060909-generic, and at the very least installing the 550 driver did not cause the kernel panic I experienced earlier.
The next issue was that the Nvidia driver would not actually start up. I figured that this was due to the integrated Intel graphics card, so I switched to discrete rendering via the MSI Center application on Windows.
Now I’ve reached a point where the login screen shows, but it repeats over and over. I switch to a different tty, and in the journalctl I find modprobe: FATAL: Module Nvidia not found in directory /lib/modules/6.9.9-060909-generic. A bit of research indicates that I need to install the linux-modules-nvidia-550-6.9.9-060909-generic package, which makes it quite unfortunate that it doesn’t actually exist. Firstly, all the linux modules I can see for nvidia-550 seem to be based on the server driver, if the names are to be believed. But more obviously, I have an invalid kernel version. Interestingly enough, there does seem to be a module for my original kernel, but understandably I’m not in a rush to move back to it.
So, will the server modules for Nvidia driver 550 suffice, and what kernel version is actually recommended?
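The modprobe failure above comes down to whether an nvidia kernel module exists under /lib/modules for the running kernel. Here is a rough Python sketch of that check. Both function names are hypothetical, the package name simply follows the linux-modules-nvidia-<driver>-<kernel> pattern quoted in the post, and the search assumes the module file is named nvidia.ko, possibly with a compression suffix such as .zst or .xz.

```python
import os
import platform


def expected_modules_package(driver_branch: str, kernel_release: str) -> str:
    """Build the prebuilt-modules package name in the pattern from the post."""
    return f"linux-modules-nvidia-{driver_branch}-{kernel_release}"


def find_nvidia_modules(kernel_release: str, modules_root: str = "/lib/modules") -> list:
    """Return paths of nvidia.ko (optionally compressed) for one kernel.

    An empty list means modprobe has nothing to load for that kernel,
    which matches the 'Module ... not found' message in the journal.
    """
    base = os.path.join(modules_root, kernel_release)
    found = []
    # os.walk yields nothing if the directory does not exist.
    for dirpath, _dirnames, filenames in os.walk(base):
        for name in filenames:
            if name == "nvidia.ko" or name.startswith("nvidia.ko."):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


if __name__ == "__main__":
    release = platform.release()  # same string as `uname -r`
    modules = find_nvidia_modules(release)
    if modules:
        print("nvidia module(s) found:", *modules, sep="\n  ")
    else:
        print("no nvidia module for", release)
        print("prebuilt package would be named:",
              expected_modules_package("550", release))
```

If the list comes back empty for a mainline kernel like 6.9.9-060909-generic, that is consistent with no prebuilt package existing for it, since those packages are built per kernel release.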
I opted to switch back to my original kernel and use driver 535. After installing the correct linux modules package, and keeping the discrete rendering mode active, it appears I have solved the issue. I’m going to test things a bit before marking this as a solution. One possible area of concern was during the startup screen. I pressed escape to see the more verbose startup and the screen went completely black for a split second before switching. This doesn’t seem to be a major problem for the moment, though.
Now I’ve reached a point where the login screen shows, but it repeats over and over. I switch to a different tty, and in the journalctl I find modprobe: FATAL: Module Nvidia not found in directory /lib/modules/6.9.9-060909-generic.
About this: You need to search for the DKMS package of the NVIDIA driver. DKMS packages are compatible with almost every Linux kernel you try, but if you are going to use the latest NVIDIA driver, you need to have your kernel updated. I don’t remember which version of the kernel you need to be able to run the modern versions of the NVIDIA driver, but 6.6 and up should work fine.
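To make the "6.6 and up" rule of thumb concrete, here is a small Python sketch that parses a kernel release string and compares it against a minimum version. The function names are hypothetical, and the (6, 6, 0) threshold is only the rough guess from this reply, not a documented NVIDIA requirement.

```python
import re


def kernel_version_tuple(release: str) -> tuple:
    """Parse the leading 'major.minor[.patch]' of a kernel release string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # A missing patch level is treated as 0.
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def meets_minimum(release: str, minimum: tuple = (6, 6, 0)) -> bool:
    """True if the release is at least `minimum` (default: the reply's guess)."""
    return kernel_version_tuple(release) >= minimum


if __name__ == "__main__":
    # The first release is the one from this thread; the second is a
    # made-up older release, used only for contrast.
    for rel in ("6.9.9-060909-generic", "5.15.0-100-generic"):
        print(rel, "meets 6.6+:", meets_minimum(rel))
```

In practice, running `uname -r` and comparing by eye does the same job; the sketch only spells out the comparison.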