How to avoid issues after kernel upgrade

Petri-X · January 23, 2025, 10:30am

Hi,
We have new servers with GPU cards (I hope this is the correct forum), but the last time our Red Hat servers got a kernel upgrade, I lost the servers: they no longer booted. The solution was to return to the previous kernel version.

This most likely means that we installed something incorrectly, as I would like to believe such issues should not occur.

This is what dkms status gives:
nvidia/565.57.01, 5.14.0-503.15.1.el9_5.x86_64, x86_64: installed

The kernel that arrived on the system was:
kernel-5.14.0-503.19.1.el9_5.x86_64
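
The two version strings above differ (503.15.1 vs 503.19.1), so the DKMS module is registered only for the older kernel. A minimal sketch of checking that automatically, assuming `dkms status` prints lines in the format shown above (`module/version, kernel, arch: state`). The function names `parse_dkms_status` and `kernels_missing_module` are hypothetical, and the sample text is copied from this post rather than read from a live system:

```python
def parse_dkms_status(text):
    """Parse lines such as 'nvidia/565.57.01, <kernel>, x86_64: installed'."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        fields, state = line.rsplit(":", 1)
        parts = [p.strip() for p in fields.split(",")]
        if len(parts) != 3 or "/" not in parts[0]:
            continue  # skip lines that do not match the assumed format
        module, version = parts[0].split("/", 1)
        entries.append({
            "module": module,
            "version": version,
            "kernel": parts[1],
            "arch": parts[2],
            "state": state.strip(),
        })
    return entries


def kernels_missing_module(entries, module, kernels):
    """Return the kernels for which `module` is not reported as installed."""
    built = {
        e["kernel"]
        for e in entries
        if e["module"] == module and e["state"].startswith("installed")
    }
    return [k for k in kernels if k not in built]


# Sample data taken from this post (kernel release without the 'kernel-' prefix).
SAMPLE = "nvidia/565.57.01, 5.14.0-503.15.1.el9_5.x86_64, x86_64: installed"
NEW_KERNEL = "5.14.0-503.19.1.el9_5.x86_64"

if __name__ == "__main__":
    entries = parse_dkms_status(SAMPLE)
    # Lists every kernel that has no installed nvidia module.
    print(kernels_missing_module(entries, "nvidia", [NEW_KERNEL]))
```

On a real system the text would come from running `dkms status`, and the kernel list from the installed kernel packages. Both are left out here to keep the sketch self-contained.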

And when looking at the console, it had the following on screen:
[    1.440287] i8042: Can't read CTR while initializing i8042
[    1.615545] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[    1.615853] CPU: 33 PID: 1 Comm: swapper/0 Not tainted 5.14.0-503.19.1.el9_5.x86_64 #1
[    1.616147] Hardware name: HPE ProLiant DL385 Gen11/ProLiant DL305 Gen11, BIOS 1.58 01/04/2024
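
One common cause of an "Unable to mount root fs on unknown-block(0,0)" panic is a missing initramfs for the newly installed kernel. Whether that is what happened on this server is not established in this thread. A minimal sketch for checking it, assuming the RHEL-style naming `vmlinuz-<release>` and `initramfs-<release>.img` under `/boot`. The function name `kernels_without_initramfs` is hypothetical, and the demo uses a temporary directory instead of the real `/boot`:

```python
import os
import tempfile


def kernels_without_initramfs(boot_dir):
    """Return kernel releases that have a vmlinuz file but no initramfs image."""
    names = set(os.listdir(boot_dir))
    missing = []
    for name in sorted(names):
        if not name.startswith("vmlinuz-"):
            continue
        release = name[len("vmlinuz-"):]
        # Assumed naming convention: initramfs-<release>.img
        if f"initramfs-{release}.img" not in names:
            missing.append(release)
    return missing


if __name__ == "__main__":
    # Simulated /boot: the old kernel has an initramfs, the new one does not.
    with tempfile.TemporaryDirectory() as boot:
        for name in (
            "vmlinuz-5.14.0-503.15.1.el9_5.x86_64",
            "initramfs-5.14.0-503.15.1.el9_5.x86_64.img",
            "vmlinuz-5.14.0-503.19.1.el9_5.x86_64",
        ):
            open(os.path.join(boot, name), "w").close()
        print(kernels_without_initramfs(boot))
```

Calling `kernels_without_initramfs("/boot")` before rebooting would list any kernel that lacks an image. The check only tests that the file exists, not that its contents are complete.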

I tried to read the documentation to see whether I had missed some steps, but could not be sure.

Should it be possible to do kernel updates without issues?

shelter · January 23, 2025, 10:42am

Not an answer to the issue, but in my opinion it’s up to the distribution maintainers to make sure the packages they provide work with their kernels, especially when it comes to Red Hat.

Petri-X · January 23, 2025, 11:33am

Hi Shelter,
I’m not sure what you mean :)
Kind of expecting that a regular kernel upgrade should not break the system.

leigh123linux · January 23, 2025, 1:43pm

Why should I support Red Hat’s bastardised kernel for rpmfusion nvidia packages?
Maybe I would reconsider this if I was paid for my time.

shelter · January 23, 2025, 1:46pm

Oh sorry, I thought the packages were provided by Red Hat, my apologies.

leigh123linux · January 23, 2025, 1:47pm

Complain to Red Hat. Their kernel is non-standard: they back-port support from the 6.xx kernel into their 5.xx kernel.

Petri-X · January 23, 2025, 6:05pm

Oh, thanks friends for your feedback. If I read you correctly, the kernel update must always be a manual operation to keep the GPU drivers in a healthy state?
