How to avoid issues after kernel upgrade

Hi,
We have new servers with GPU cards (I hope this is the correct forum). The last time our Red Hat servers received a kernel upgrade, they failed to boot. The workaround was to boot back into the previous kernel version.

This most likely means that we have installed something incorrectly, as I would like to believe such issues should not occur.

This is what dkms status gives:
nvidia/565.57.01, 5.14.0-503.15.1.el9_5.x86_64, x86_64: installed

The kernel that arrived on the system was:
kernel-5.14.0-503.19.1.el9_5.x86_64
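
Note that the dkms status line above lists the nvidia module only for 5.14.0-503.15.1, not for the newly arrived 5.14.0-503.19.1. A minimal sketch of how that could be checked before rebooting follows. The function name module_built_for is made up for illustration, and the parsing assumes the dkms status line format shown in this post (older dkms versions print a different layout).

```shell
# Succeeds only if `dkms status` output on stdin lists MODULE as
# "installed" for KERNEL. Assumed line shape (copied from the post):
#   nvidia/565.57.01, 5.14.0-503.15.1.el9_5.x86_64, x86_64: installed
module_built_for() {
    local module="$1" kernel="$2"
    awk -F', *' -v m="$module" -v k="$kernel" '
        # $1 is "module/version", $2 is the kernel release
        index($1, m "/") == 1 && $2 == k && /: installed/ { found = 1 }
        END { exit !found }
    '
}

# Possible usage on the affected server (not run here):
#   new=5.14.0-503.19.1.el9_5.x86_64
#   dkms status | module_built_for nvidia "$new" \
#       || sudo dkms autoinstall -k "$new"
```

If the check fails, dkms autoinstall -k asks dkms to build and install its registered modules for that kernel, which normally needs the matching kernel-devel package to be present.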

And when looking at the console, it showed the following:
[ 1.440287] i8042: Can't read CTR while initializing i8042
[ 1.615545] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[ 1.615853] CPU: 33 PID: 1 Comm: swapper/0 Not tainted 5.14.0-503.19.1.el9_5.x86_64 WI
[ 1.616147] Hardware name: HPE ProLiant DL385 Gen11/ProLiant DL305 Gen11, BIOS 1.58 01/04/2024
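
A panic of the form "Unable to mount root fs on unknown-block(0,0)" often means the kernel came up without a usable initramfs, although nothing in this thread confirms that this is what happened here. As a rough sketch, one could check whether an initramfs image exists for the new kernel. The helper name has_initramfs is hypothetical, and the file name pattern is the usual RHEL one.

```shell
# Succeeds if a non-empty initramfs image exists for KERNEL under BOOT_DIR.
# BOOT_DIR defaults to /boot; initramfs-<kernel>.img is the usual RHEL
# naming and is an assumption about this particular system.
has_initramfs() {
    local kernel="$1" boot_dir="${2:-/boot}"
    [ -s "${boot_dir}/initramfs-${kernel}.img" ]
}

# Possible usage on the affected server (not run here):
#   new=5.14.0-503.19.1.el9_5.x86_64
#   has_initramfs "$new" \
#       || sudo dracut -f "/boot/initramfs-${new}.img" "$new"
```

If the image is missing or empty, dracut -f regenerates it for the named kernel.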

I tried to read the documentation to see whether I had missed some steps, but could not be sure.

Should it be possible to apply kernel updates without issues?

Not an answer to the issue, but in my opinion it’s up to the distribution maintainers to make sure their provided packages work with their kernels, especially when it comes to Red Hat.

Hi Shelter,
I’m not sure what you mean :)
I was kind of expecting that a regular kernel upgrade would not break the system.

Why should I support Red Hat’s bastardised kernel for the rpmfusion nvidia packages?
Maybe I would reconsider this if I was paid for my time.

Oh sorry, I thought the packages were provided by Red Hat. My apologies.

Complain to Red Hat. Their kernel is non-standard: they back-port support from the 6.xx kernel into their 5.xx kernel.

Oh, thanks friends for your feedback. If I understand you correctly, the kernel update must always be a manual operation to keep the GPU drivers in a healthy state?
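
Nobody in the thread spelled out a procedure, so the following is only one possible routine, not a confirmed answer: after an update installs a new kernel but before rebooting, list the installed kernels that are not currently running and make sure the DKMS modules are built for each. The function name kernels_not_running is invented for this sketch, and it assumes rpm -q kernel prints lines like the kernel-5.14.0-503.19.1.el9_5.x86_64 line quoted earlier.

```shell
# Reads `rpm -q kernel` output on stdin and prints every installed kernel
# release except RUNNING (normally "$(uname -r)").
kernels_not_running() {
    local running="$1" line
    while IFS= read -r line; do
        line="${line#kernel-}"    # strip the package name prefix
        [ -n "$line" ] && [ "$line" != "$running" ] && printf '%s\n' "$line"
    done
    return 0
}

# Possible usage before rebooting into a new kernel (not run here):
#   rpm -q kernel | kernels_not_running "$(uname -r)" | while read -r k; do
#       sudo dkms autoinstall -k "$k"
#   done
#
# To hold kernels back during routine updates instead:
#   sudo dnf upgrade --exclude='kernel*'
```

The exclude variant keeps the running kernel in place until there is time to test the driver against the new one.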