I support several software developers - all of them have dual- or triple-head workstations running NVIDIA 315, NVIDIA 510, or NVIDIA 620 video cards. Ever since we upgraded from CentOS 6 to CentOS 7, we have had to reinstall the NVIDIA video driver after any yum update that included a new kernel. Reinstalling the video driver simply became the expected last step of every such update.
Ever since the CentOS 3.10.0-1160.15.2.el7.x86_64 kernel came out, reinstalling the video driver goes badly wrong maybe 20% of the time - it spawns a couple thousand "sh" processes and maxes out the RAM and CPU until the machine slows to a crawl and eventually crashes.
To get the machine back to a working state I have to drop it down to run level three, re-enable the nouveau video driver, delete /etc/X11/xorg.conf, and reboot. I tried to uninstall the NVIDIA driver using the --uninstall switch, but running the installer at all kills the machine again.
Once I can boot into X Windows with the CentOS nouveau driver, I can reinstall the NVIDIA video driver. Any customizations the user had in their video settings are lost and they'll have to start from scratch.
These are the video drivers I'm using for each card:
315 card NVIDIA-Linux-x86_64-390.138.run
510 card NVIDIA-Linux-x86_64-410.73.run
620 card NVIDIA-Linux-x86_64-450.66.run
Any suggestions on how to make this process less painful?
Thanks, PG
It would be preferable to get the NVIDIA drivers to work correctly. These systems are on a network that isn't connected to the internet, and external repos make things a lot more complicated than they should be.
Do you have any tricks to get the NVIDIA drivers to work better?
I tried rpmfusion on a test system that was connected to the internet and could use the rpmfusion repos, and the installation worked fine. That particular system had kernels going back to 3.10.0-1127, plus the most recent 3.10.0-1160 kernels.
When I tried it on the production network, which is using the 3.10.0-1160 kernels (nothing prior to that), the install choked with an error about needing a kernel dependency of less than 3.10.0-1128.
Does that imply that the rpmfusion video driver is meant for CentOS 7.8 or earlier (3.10.0-1160 is CentOS 7.9)? Is there a newer version coming out?
I'm testing the "Three possibilities" response - the third is kind of over my head. I tried the other two but had sketchy results. I want to try a few more times before claiming success or failure. If I can get this to be more reliable than what's been happening, that in itself would be a success of sorts.
Looking at the rpmfusion repo contents, it looks like they have indeed dropped support for CentOS 7, which is a pity.
Though if you're getting "sketchy results" when only compiling the kernel modules, that makes me wonder what's going on in detail.
One common denominator I'm seeing between the updates that fail and the ones that work: the workstations that were updated from CentOS 7.8 to 7.9 seem likely to fail ("rpmquery kernel" shows both 3.10.0-1127 and 3.10.0-1160 kernels), while the ones where the initial install was 7.9 seem to work fine. I just tried a different tactic on a machine updated from 7.8 - I removed the NVIDIA driver before doing yum update and reinstalled it afterwards, and that one went fine. Not sure if that's a one-off or a trend. I'll have to try some more and see how it goes.
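To sort machines into the "updated from 7.8" and "installed as 7.9" groups, the "rpmquery kernel" output can be reduced to its distinct kernel series. This is only a minimal sketch: it assumes package names of the form kernel-3.10.0-NNNN..., and the helper names kernel_series and has_mixed_series are made up for illustration. The sample input stands in for real output.

```shell
#!/bin/sh
# Hypothetical helpers, not part of rpm or yum.

# Read `rpm -q kernel` style lines on stdin and print each distinct
# version-build series once, e.g. "3.10.0-1160".
kernel_series() {
    sed -n 's/^kernel-\([0-9.]*-[0-9]*\).*/\1/p' | sort -u
}

# Succeed (exit 0) when stdin lists kernels from more than one series.
has_mixed_series() {
    [ "$(kernel_series | grep -c .)" -gt 1 ]
}

# Sample input standing in for real `rpm -q kernel` output.
sample='kernel-3.10.0-1127.el7.x86_64
kernel-3.10.0-1160.15.2.el7.x86_64'

printf '%s\n' "$sample" | kernel_series
if printf '%s\n' "$sample" | has_mixed_series; then
    echo "mixed kernel series installed"
fi
```

On a real workstation the sample would be replaced by the live query, for example `rpm -q kernel | has_mixed_series && echo "mixed kernel series installed"`.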
Ok, here's what should be the final update.
DKMS was not installed on any of these machines before. I tried installing DKMS on some test machines and then doing a yum update, but they still crapped out.
I tried running the NVIDIA driver installer with the --dkms switch before running yum update and that seemed to do the trick. The machines I did that on came up without needing a driver reinstall after the yum update.
So it looks like the key is that the driver installer has to be run with the --dkms switch prior to doing an update - then all is well.
Thanks generix for your help - you gave very useful suggestions, and it worked out in the end.
Ok, seems it wasn't clear, the procedure is:
- install dkms
- run the runfile installer with --dkms option to register with dkms
- on any subsequent kernel update, the driver gets recompiled automatically (no further driver installation necessary)
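The steps above could be scripted roughly as follows. This is a sketch under assumptions, not a tested deployment script: the RUNFILE path is a placeholder (pick the runfile matching the card), the helper names run and setup_nvidia_dkms are made up, and on an offline network the dkms package would have to come from a local RPM or mirror rather than a plain yum install. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) only prints each command.
# Set DRY_RUN=0 and run as root to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# Assumption: location of the runfile; adjust per card and driver version.
RUNFILE="${RUNFILE:-./NVIDIA-Linux-x86_64-450.66.run}"

# Hypothetical helper: print the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_nvidia_dkms() {
    # 1. Install DKMS (assumes a repo or local mirror that provides it).
    run yum install -y dkms
    # 2. Run the runfile installer with --dkms to register the driver.
    run sh "$RUNFILE" --dkms
    # 3. List what DKMS has registered, to confirm the nvidia module is there.
    run dkms status
}

setup_nvidia_dkms
```

DKMS rebuilds the module from source on each kernel update, so the machines most likely also need a compiler and the kernel-devel package matching the new kernel for the automatic rebuild to succeed.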