Pop!_OS 24.04 freezes during boot with NVIDIA driver (Xid 62, page fault)

I’m trying to isolate a boot freeze on a hybrid Intel + NVIDIA laptop running Pop!_OS.

System

  • Laptop: HP Pavilion cx 144 gaming
  • OS: Pop!_OS 24.04
  • Kernel: 6.17.9-76061709-generic
  • GPUs: Intel UHD 630 + NVIDIA GP107M / GTX 1050 Ti Mobile
  • NVIDIA driver: 470.256.02 (nvidia-dkms-470 / nvidia-driver-470)
  • Boot chain: rEFInd → systemd-boot → Pop!_OS

At first I was using the newer nvidia-driver-580, but that was also failing. I asked an LLM and it suggested that since my GTX 1050 Ti Mobile is an older Pascal mobile GPU, the newer NVIDIA path was probably not the best fit on this setup, so it told me to downgrade to 470.

Problem

Pop!_OS boots reliably only when I use a safe Intel-only boot entry that blacklists the NVIDIA modules.

When I boot normally with the proprietary NVIDIA stack enabled, the machine freezes during boot. As in the image below, the screen stays frozen and I have to force a restart.

External Media

The visible splash/progress line is not consistent, so I do not think the last on-screen service name is the real cause.

Also, before installing Pop!_OS, I was on Windows 11, where the NVIDIA GPU was not showing up correctly. After updating the NVIDIA driver, the system crashed and became unstable, and I had to disable the driver from Windows Recovery. Rather than continuing to debug it on Windows 11, I installed Pop!_OS to check whether this is a software issue or a hardware problem, since I expected debugging drivers on Linux to be easier than on Windows.


I created a verbose boot entry with NVIDIA enabled. On that boot, the journal shows:

journalctl -b -1 -k --no-pager \
| grep -Ei 'NVRM: GPU at PCI:0000:01:00|NVRM: Xid \(PCI:0000:01:00\)|BUG: kernel NULL pointer dereference|#PF:|RIP: .*_nv|rm_init_adapter|nvkms_open_gpu|nv_drm_load|nv_drm_probe_devices|nv_linux_drm_init' \
| grep -Evi 'pcieport|aer|alcor|sdcard'

Output:

Apr 05 10:05:32 pop-os kernel: NVRM: Xid (PCI:0000:01:00): 62, pid=496, ...
Apr 05 10:05:41 pop-os kernel: #PF: error_code(0x0002) - not-present page
Apr 05 10:05:41 pop-os kernel: RIP: 0010:_nv035204rm+0xac/0x130 [nvidia]
Apr 05 10:05:41 pop-os kernel:  ? rm_init_adapter+0xc5/0xe0 [nvidia]
Apr 05 10:05:41 pop-os kernel:  ? nvkms_open_gpu+0x4e/0x90 [nvidia_modeset]
Apr 05 10:05:41 pop-os kernel:  ? nv_drm_load+0x10d/0x480 [nvidia_drm]
Apr 05 10:05:41 pop-os kernel:  ? nv_drm_probe_devices+0x1eb/0x2c0 [nvidia_drm]
Apr 05 10:05:41 pop-os kernel:  ? nv_linux_drm_init+0xe/0xff0 [nvidia_drm]
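
To compare boots (470 vs 580, with and without nvidia_drm) it may help to reduce each kernel log to just the Xid codes it contains. This is a minimal sketch, assuming the "NVRM: Xid (PCI:...): <code>," line format shown above; the function name xid_summary is made up for illustration.

```shell
# Print each distinct Xid code found in kernel log lines on stdin,
# prefixed by how many times it occurred.
# Assumes the "NVRM: Xid (PCI:...): <code>," format seen in the journal above.
xid_summary() {
  sed -n 's/.*NVRM: Xid ([^)]*): \([0-9][0-9]*\),.*/\1/p' | sort -n | uniq -c
}

# Example usage against the previous boot's kernel log:
# journalctl -b -1 -k --no-pager | xid_summary
```

If every failing boot reports the same single code before the page fault, that points at one repeatable init failure rather than random corruption.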

I checked the journalctl errors and saw NVIDIA-related lines like rm_init_adapter and nvkms_open_gpu. Based on some online searching, I thought the freeze might be happening when nvidia_drm gets involved, so I tried a second boot entry that disables only that part during boot:

  • nvidia-drm.modeset=0
  • module_blacklist=nvidia_drm
  • modprobe.blacklist=nvidia_drm
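
Since this goes through rEFInd and then systemd-boot, it seems worth confirming that the parameters actually reached the kernel on that debug boot. A small sketch that checks /proc/cmdline for each one; has_kernel_param is a hypothetical helper name, and the match is exact, so a spelling like nvidia_drm.modeset=0 would be reported as missing here.

```shell
# Return 0 if the exact kernel parameter $1 appears in the command-line file $2.
# $2 defaults to /proc/cmdline, which holds the running kernel's parameters.
has_kernel_param() {
  param=$1
  file=${2:-/proc/cmdline}
  # One parameter per line, then an exact fixed-string, whole-line match.
  tr ' ' '\n' < "$file" | grep -Fxq -- "$param"
}

for p in nvidia-drm.modeset=0 module_blacklist=nvidia_drm modprobe.blacklist=nvidia_drm; do
  if has_kernel_param "$p"; then
    echo "present: $p"
  else
    echo "MISSING: $p"
  fi
done
```

Running lsmod | grep nvidia afterwards shows whether nvidia_drm really stayed unloaded while the base nvidia module loaded.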

That time the machine did not freeze at the same early boot stage, but the desktop was still not healthy. Apps like Chrome and VS Code were not opening properly, and the only way I could get VS Code to run was by forcing software rendering with:

ELECTRON_OZONE_PLATFORM_HINT=x11 LIBGL_ALWAYS_SOFTWARE=1 code --disable-gpu --disable-software-rasterizer --ozone-platform=x11 .

So that seemed to disable GPU rendering enough to make VS Code open, but it did not mean the system was actually fixed. A later freeze from that debug boot still showed another NVIDIA-side kernel fault.

For that later NoDRM crash, this is the relevant kernel output:

Apr 05 13:02:21 pop-os kernel: #PF: error_code(0x0000) - not-present page
Apr 05 13:02:21 pop-os kernel: RIP: 0010:_nv010161rm+0x3c/0x340 [nvidia]
Apr 05 13:02:21 pop-os kernel:  ? rm_get_gpu_uuid+0x28/0x150 [nvidia]
Apr 05 13:02:21 pop-os kernel:  ? nv_procfs_read_gpu_info+0x14f/0x330 [nvidia]
Apr 05 13:02:21 pop-os kernel: RIP: 0010:_nv035204rm+0xac/0x130 [nvidia]

So even with nvidia_drm blocked, the base nvidia driver path still appears to hit a kernel page fault.

I also tested whether the dGPU powers on, is reachable, and whether the PCIe link trains correctly:

system76-power graphics power on
lspci -nn | grep -iE 'vga|3d|nvidia'
lspci -vv -s 01:00.0 | grep -E 'LnkCap|LnkSta'

Output:

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP107M [GeForce GTX 1050 Ti Mobile] [10de:1c8c] (rev a1)
01:00.1 Audio device [0403]: NVIDIA Corporation GP107GL High Definition Audio Controller [10de:0fb9] (rev a1)
LnkCap: Port #0, Speed 8GT/s, Width x16
LnkSta: Speed 8GT/s, Width x16

This suggests that:

  • The NVIDIA GPU still powers on and enumerates on PCI
  • The PCIe link trains correctly at 8.0 GT/s x16
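
The LnkCap/LnkSta comparison above can also be done mechanically, which is handy when repeating it across boots. A rough sketch, assuming the "Speed ..., Width ..." layout that lspci -vv printed here; link_check is a made-up name. A lower speed on an idle GPU can simply be power management, so a mismatch is a hint and not proof of a fault.

```shell
# Read `lspci -vv` output on stdin and report whether the negotiated link
# (LnkSta) matches the advertised maximum (LnkCap) in speed and width.
link_check() {
  awk '
    # Extract the token following "Speed" or "Width" on a line, or "?" if absent.
    function field(line, key) {
      if (!match(line, key " [^ ,]+")) return "?"
      return substr(line, RSTART + length(key) + 1, RLENGTH - length(key) - 1)
    }
    /LnkCap:/ { cap = field($0, "Speed") " " field($0, "Width") }
    /LnkSta:/ { sta = field($0, "Speed") " " field($0, "Width") }
    END {
      if (cap == "" || sta == "") { print "no link info found"; exit 2 }
      if (cap == sta) print "OK: link at full capability (" sta ")"
      else print "DEGRADED: capable of " cap ", running at " sta
    }
  '
}

# Example usage (root is needed for lspci to show the capability fields):
# sudo lspci -vv -s 01:00.0 | link_check
```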

Question

At this point I feel like I’m running out of debugging steps I can think of. I also don’t have a strong mental model of how the NVIDIA driver stack (kernel module, DRM, firmware, etc.) actually initializes during boot, so I’m not sure where to go deeper.

From what I’ve tested so far, does this at least suggest the GPU hardware is fine, and this is more likely a driver/kernel issue? Any help regarding what I can try next to debug this further in a meaningful way?

If you’re having issues on 580, on an extremely old driver, AND on Windows 11, it sure doesn’t sound like a software issue, but a hardware one.

Hey, I tested it again. nvidia-470 was freezing the system because it’s not compatible with kernel 6.17.9. I purged 470 and first installed nvidia-590 which printed “The NVIDIA GeForce GTX 1050 Ti GPU is not supported by the 590.48.01 driver. Use 580.xx Legacy.” Then I installed nvidia-580, it detected the GPU but failed with Xid 62 (falcon init failure) on every boot. Then I ran the GPU with nouveau and was able to run glmark2 with the NVIDIA GPU, score 541, OpenGL 4.3 working fine.

So the issue seems to be a compatibility problem between the proprietary NVIDIA driver and kernel 6.17.9. Which NVIDIA driver should I use on Pop!_OS 24.04 with a GTX 1050 Ti?
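
When switching between driver packages, one way to check whether a given driver was even built for the running kernel is to look at the DKMS state. This is a sketch under assumptions: nvidia_dkms_state is a hypothetical helper, and it assumes dkms status prints lines of the form "nvidia/<version>, <kernel>, <arch>: <state>" (older DKMS releases format this slightly differently).

```shell
# Read `dkms status` output on stdin and print the state of every nvidia
# module entry built for kernel release $1, or "not built" if there is none.
nvidia_dkms_state() {
  kernel=$1
  awk -v k="$kernel" '
    # Keep nvidia entries mentioning this kernel; strip everything up to the state.
    /^nvidia/ && index($0, k) { sub(/^.*: */, ""); print; found = 1 }
    END { if (!found) print "not built" }
  '
}

# Example usage for the currently running kernel:
# dkms status | nvidia_dkms_state "$(uname -r)"
# modinfo -F version nvidia    # version of the module that would be loaded
```

An "installed" state only says the module compiled and was installed for that kernel; it can still fault at runtime, as the traces above show.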