What you are reporting is confusing at first. Crashing while playing games and crashing while starting your computer are two distinct problems.
Looking at dmesg, the freeze occurs after the kernel has gone through its typical startup procedure, once systemd starts to load. Generally, when a system with an nvidia card freezes on bootup, it is either because no nvidia driver is installed, which crashes the desktop environment and its startup process, or because the nvidia driver was not installed properly, which can happen for various reasons. Other than that, you could have a problem with the hardware itself, which is ideally the first place to look when troubleshooting any problem.
Here’s what I can identify based on the information you have provided.
You’re using the Linux 6.8.1-zen1-1-zen kernel.
You’re using EndeavourOS, based on the kernel command line, you have an nvidia driver installed, and you are using btrfs.
The kernel goes through its standard startup procedure.
There are some errors reported that I’ve never seen before, and they have to do with nvidia:
root@pc:~/Desktop# cat show3x_dmesg | grep error
[ 0.968456] pcieport 0000:00:1c.2: DPC: error containment capabilities: Int Msg #0, RPExt+ PoisonedTLP+ SwTrigger+ RP PIO Log 4, DL_ActiveErr+
[ 7.736815] nvidia-gpu 0000:01:00.3: i2c timeout error e0000000
[ 7.736822] ucsi_ccg: probe of 5-0008 failed with error -110
root@pc:~/Desktop# cat show3x_dmesg | grep fail
[ 3.011052] ACPI: _TZ_.TZ10: _PSL evaluation failure
[ 4.048164] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[ 7.736818] ucsi_ccg 5-0008: i2c_transfer failed -110
[ 7.736820] ucsi_ccg 5-0008: ucsi_ccg_init failed - -110
[ 7.736822] ucsi_ccg: probe of 5-0008 failed with error -110
root@pc:~/Desktop# cat show3x_dmesg | grep warn
root@pc:~/Desktop#
.048150] nvidia: loading out-of-tree module taints kernel.
[ 4.048159] nvidia: module license 'NVIDIA' taints kernel.
[ 4.048160] Disabling lock debugging due to kernel taint
[ 4.048164] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[ 4.048165] nvidia: module license taints kernel.
That’s normal, and shouldn’t be a problem unless you’re using Ubuntu’s latest features and following its security guidelines, which prevent unregistered (unsigned) modules from being loaded.
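The three separate greps above can be folded into a single pass. This is only a minimal sketch: scan_log is a made-up helper name, and it assumes the log was saved to a file such as the show3x_dmesg used in the transcript. Matching case-insensitively also catches spellings like "Error" or "FAILED", which the lowercase greps would miss.

```shell
#!/bin/sh
# scan_log FILE (hypothetical helper, not part of any tool)
# Print every line that mentions an error, failure, or warning.
#   -i  ignore case
#   -n  prefix each match with its line number in FILE
#   -E  extended regex, needed for the | alternation
scan_log() {
    grep -inE 'error|fail|warn' "$1"
}

# Example, using the file name from the transcript:
#   scan_log show3x_dmesg
```

The line numbers make it easier to go back to the full log and read what happened just before each message.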
Searching around a bit for some of your error messages, people report that they point to hardware rather than software issues.
The thing is, the Google search results for “gpu has fallen off the bus” mostly describe a hardware issue, but you apparently installed Linux with your nvidia card without issues, on multiple distros, and sometimes even got some gaming in.
If you could provide more information about what hardware you are using, how you go about your basic installation, and anything else you can think of related to your overall setup, that would be helpful.
The standard workaround to get past a crashed boot process with an nvidia card and an inappropriate Linux configuration (which is what typically crashes the boot process) is to append nomodeset to the kernel command line.
Alternatively, on a systemd-based system you can append the number 3 to the kernel command line, which tells systemd to boot into runlevel 3 (multi-user.target) instead of a graphical desktop environment. There you will have a terminal to interact with your system, where you can attempt to fix the problem as well.
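Both options come down to adding one word to the kernel command line, usually as a one-off edit from the boot menu (in GRUB, for example, press e on the entry and add the word to the end of the line that starts with linux). Once the system is up, it helps to confirm that the parameter was actually applied. A minimal sketch of such a check follows; has_kernel_param is a hypothetical helper name, and it assumes the running kernel's command line is readable at /proc/cmdline.

```shell
#!/bin/sh
# has_kernel_param CMDLINE PARAM (hypothetical helper)
# Succeeds (exit status 0) only if PARAM appears in CMDLINE as a whole,
# space-separated word, so "3" does not match inside a UUID and
# "nomodeset" does not match "nomodeset=1".
has_kernel_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example:
#   cmdline=$(cat /proc/cmdline)
#   has_kernel_param "$cmdline" nomodeset && echo "nomodeset is active"
#   has_kernel_param "$cmdline" 3 && echo "booted to the text-mode target"
```

Because the edit from the boot menu only lasts for that one boot, nothing needs to be undone if it does not help.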
Afterwards you will want to switch to the X11 display server. If the problem still exists, either revert to a different kernel or reinstall the driver properly, checking its installation every step of the way to make sure no errors are reported.
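To confirm which display server a session actually ended up on, the XDG_SESSION_TYPE environment variable set by the login session is usually enough. A small sketch, assuming a systemd-logind based login; describe_session is a made-up name for illustration.

```shell
#!/bin/sh
# describe_session (hypothetical helper)
# Report the current session type from XDG_SESSION_TYPE.
# Falls back to "unknown" when the variable is unset or empty.
describe_session() {
    case "${XDG_SESSION_TYPE:-unknown}" in
        x11)     echo "X11 session" ;;
        wayland) echo "Wayland session" ;;
        tty)     echo "text console, no display server" ;;
        *)       echo "unknown session type" ;;
    esac
}

# Example, run from a terminal inside the desktop session:
#   describe_session
```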
edit:
It could be that you have a problem unrelated to nvidia, but without more information we have no way of knowing, short of first-hand experience with these issues.
In your last quoted section, the driver reports that a GPU crash dump has been created and asks you to run nvidia-bug-report.sh as root, apparently to collect it before the NVIDIA kernel module is unloaded. You can try to recreate the crash and grab the log to upload it here, or you can try to configure your system properly first and see if you can get it working.
kernel: NVRM: A GPU crash dump has been created.
NVRM: nvidia-bug-report.sh as root to col>
NVRM: the NVIDIA kernel module is unloade>
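If you go the log route, the lines worth posting are the NVRM ones and anything about the GPU falling off the bus. A minimal sketch, assuming the kernel log of the crashed boot has been saved to a file first; nvrm_lines and crash.log are hypothetical names, and the journalctl line in the comments only works if your journal is kept across reboots.

```shell
#!/bin/sh
# nvrm_lines FILE (hypothetical helper)
# Print the NVIDIA driver (NVRM) messages and any
# "fallen off the bus" reports found in a saved kernel log.
nvrm_lines() {
    grep -iE 'NVRM|fallen off the bus' "$1"
}

# Example:
#   journalctl -k -b -1 > crash.log   # kernel messages from the previous boot
#   nvrm_lines crash.log
#   sudo nvidia-bug-report.sh         # the script named in the NVRM message
```

Saving to a file first also avoids the truncated lines (the trailing > characters) that the pager produces in the quote above.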