tl;dr: with two Quadro 600 cards installed, the kernel crashes during xinit at OS startup with “BUG: unable to handle kernel NULL pointer dereference at 0000000000000220”.
I tried to set up a three-monitor configuration for X. Using /usr/bin/nvidia-settings, I created an /etc/X11/xorg.conf file.
The process is to use nvidia-settings to create the xorg.conf file, then reboot.
It ran successfully only once: all three monitors were in use on a single X screen, and I could drag a window between them.
I then regenerated the xorg.conf with some tweaks and rebooted. Since then, the kernel crashes during boot.
The same hardware and software run without crashing under other configurations, including Xinerama. The bug seems to depend on the combination of two graphics cards and the BaseMosaic setting.
Hardware setup:
- Two Quadro 600 Graphics cards
- ThinkStation D30
- Three monitors: Dell U3011, HP ZR30w, Dell P2419H
- Graphics card at PCI:3:0:0 connected to HP ZR30w (over DisplayPort) and Dell P2419H (over DVI-I)
- Graphics card at PCI:4:0:0 connected to Dell U3011 (over DisplayPort)
Software setup:
- Ubuntu MATE 18.04.2 LTS
- Linux kernel 4.18.0-21-generic #22~18.04.1-Ubuntu SMP Thu May 16 15:07:19 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
- nvidia driver version 390.116
- X Server version 11.0
- X Server Vendor version 1.20.1
/etc/X11/xorg.conf (generated by nvidia-settings)
# nvidia-settings: X configuration file generated by nvidia-settings
# nvidia-settings: version 390.77 (buildd@lcy01-amd64-022) Thu Sep 6 07:51:39 UTC 2018
Section "ServerLayout"
    Identifier "Layout0"
    Screen 0 "Screen0" 0 0
    InputDevice "Keyboard0" "CoreKeyboard"
    InputDevice "Mouse0" "CorePointer"
    Option "Xinerama" "0"
EndSection

Section "Files"
EndSection

Section "Module"
    Load "dbe"
    Load "extmod"
    Load "type1"
    Load "freetype"
    Load "glx"
EndSection

Section "InputDevice"
    # generated from default
    Identifier "Mouse0"
    Driver "mouse"
    Option "Protocol" "auto"
    Option "Device" "/dev/psaux"
    Option "Emulate3Buttons" "no"
    Option "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"
    # generated from default
    Identifier "Keyboard0"
    Driver "kbd"
EndSection

Section "Monitor"
    # HorizSync source: edid, VertRefresh source: edid
    Identifier "Monitor0"
    VendorName "Unknown"
    ModelName "HP ZR30w"
    HorizSync 49.3 - 98.7
    VertRefresh 59.9 - 60.0
    Option "DPMS"
EndSection

Section "Device"
    Identifier "Device0"
    Driver "nvidia"
    VendorName "NVIDIA Corporation"
    BoardName "Quadro 600"
    BusID "PCI:3:0:0"
EndSection

Section "Screen"
    Identifier "Screen0"
    Device "Device0"
    Monitor "Monitor0"
    DefaultDepth 24
    Option "Stereo" "0"
    Option "nvidiaXineramaInfoOrder" "DFP-0"
    Option "metamodes" "GPU-0c7a8c32-5e99-9414-de72-14532e382a7d.DVI-I-1: nvidia-auto-select +1080+320, GPU-0c7a8c32-5e99-9414-de72-14532e382a7d.DP-1: nvidia-auto-select +0+0 {rotation=right}, GPU-4d8d152a-ff12-a227-2d9d-412d85748d56.DP-1: nvidia-auto-select +3640+320"
    Option "MultiGPU" "Off"
    Option "SLI" "off"
    Option "BaseMosaic" "on"
    SubSection "Display"
        Depth 24
    EndSubSection
EndSection
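
To make the layout easier to read, here is a small Python sketch that splits the metamodes string from Screen0 into one entry per display and groups the outputs by GPU UUID. The regular expression is my own and only covers the syntax that appears in this file; it is not a parser for the full MetaMode grammar. It shows that the single MetaMode spans outputs on both cards, which is the case BaseMosaic is enabled for.

```python
import re
from collections import defaultdict

# MetaModes value copied verbatim from the Screen0 section of this xorg.conf.
METAMODES = (
    "GPU-0c7a8c32-5e99-9414-de72-14532e382a7d.DVI-I-1: nvidia-auto-select +1080+320, "
    "GPU-0c7a8c32-5e99-9414-de72-14532e382a7d.DP-1: nvidia-auto-select +0+0 {rotation=right}, "
    "GPU-4d8d152a-ff12-a227-2d9d-412d85748d56.DP-1: nvidia-auto-select +3640+320"
)

# Assumption: every entry looks like "<GPU uuid>.<output>: <mode> +X+Y {attrs}",
# with the {attrs} part optional. Other MetaMode syntax is not handled.
ENTRY = re.compile(
    r"(?P<gpu>GPU-[0-9a-f-]+)\.(?P<output>[A-Za-z0-9-]+):\s*"
    r"(?P<mode>\S+)\s+\+(?P<x>\d+)\+(?P<y>\d+)"
    r"(?:\s*\{(?P<attrs>[^}]*)\})?"
)


def parse_metamode(metamode):
    """Return one dict per display found in a single MetaMode string."""
    return [
        {
            "gpu": m["gpu"],
            "output": m["output"],
            "mode": m["mode"],
            "offset": (int(m["x"]), int(m["y"])),
            "attrs": m["attrs"] or "",
        }
        for m in ENTRY.finditer(metamode)
    ]


displays = parse_metamode(METAMODES)

# Group output names by the GPU that drives them.
by_gpu = defaultdict(list)
for d in displays:
    by_gpu[d["gpu"]].append(d["output"])

for gpu, outputs in by_gpu.items():
    print(gpu, "->", ", ".join(outputs))
```

One GPU drives DVI-I-1 and DP-1 and the other drives a single DP-1, which matches the cabling listed under the hardware setup.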
/var/log/syslog section with crash
Jun 17 09:51:23 wkstn-A kernel: [ 72.556150] resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
Jun 17 09:51:23 wkstn-A kernel: [ 72.556346] caller os_map_kernel_space.part.8+0xe1/0x130 [nvidia] mapping multiple BARs
Jun 17 09:51:24 wkstn-A kernel: [ 73.179862] resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
Jun 17 09:51:24 wkstn-A kernel: [ 73.180059] caller os_map_kernel_space.part.8+0xe1/0x130 [nvidia] mapping multiple BARs
Jun 17 09:51:24 wkstn-A kernel: [ 73.333410] do_IRQ: 12.34 No irq handler for vector
Jun 17 09:51:25 wkstn-A kernel: [ 73.429368] BUG: unable to handle kernel NULL pointer dereference at 0000000000000220
Jun 17 09:51:25 wkstn-A kernel: [ 73.429369] PGD 0 P4D 0
Jun 17 09:51:25 wkstn-A kernel: [ 73.429372] Oops: 0000 [#1] SMP PTI
Jun 17 09:51:25 wkstn-A kernel: [ 73.429375] CPU: 17 PID: 3233 Comm: Xorg Tainted: P OE 4.18.0-21-generic #22~18.04.1-Ubuntu
Jun 17 09:51:25 wkstn-A kernel: [ 73.429376] Hardware name: LENOVO 4223W1H/LENOVO, BIOS A1KT44AUS 12/18/2012
Jun 17 09:51:25 wkstn-A kernel: [ 73.429541] RIP: 0010:_nv011408rm+0x22/0x90 [nvidia]
Jun 17 09:51:25 wkstn-A kernel: [ 73.429541] Code: 0f 1f 84 00 00 00 00 00 41 56 49 89 ce 41 55 41 89 d5 41 54 49 89 f4 53 48 83 ec 08 48 85 c9 74 60 48 8b 96 30 1d 00 00 31 db <80> ba 20 02 00 00 00 74 2c 0f 1f 44 00 00 0f b6 f3 48 89 d7 ff 52
Jun 17 09:51:25 wkstn-A kernel: [ 73.429567] RSP: 0018:ffff9d81c59e7c18 EFLAGS: 00010246
Jun 17 09:51:25 wkstn-A kernel: [ 73.429569] RAX: ffff90e94e862e60 RBX: 0000000000000000 RCX: ffff90e94e862e4c
Jun 17 09:51:25 wkstn-A kernel: [ 73.429570] RDX: 0000000000000000 RSI: ffff90e955040008 RDI: ffff90e94e9dc008
Jun 17 09:51:25 wkstn-A kernel: [ 73.429571] RBP: ffff90e94e862e48 R08: 0000000000000000 R09: ffff90e94e862e60
Jun 17 09:51:25 wkstn-A kernel: [ 73.429572] R10: 0000000000000000 R11: ffffffffc06154d0 R12: ffff90e955040008
Jun 17 09:51:25 wkstn-A kernel: [ 73.429572] R13: 0000000000000007 R14: ffff90e94e862e4c R15: ffff90e94e862e60
Jun 17 09:51:25 wkstn-A kernel: [ 73.429574] FS: 00007ff49e583a80(0000) GS:ffff90e96fcc0000(0000) knlGS:0000000000000000
Jun 17 09:51:25 wkstn-A kernel: [ 73.429575] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 17 09:51:25 wkstn-A kernel: [ 73.429576] CR2: 0000000000000220 CR3: 0000000465bc0001 CR4: 00000000000606e0
Jun 17 09:51:25 wkstn-A kernel: [ 73.429577] Call Trace:
Jun 17 09:51:25 wkstn-A kernel: [ 73.429722] ? _nv011404rm+0x12/0x40 [nvidia]
Jun 17 09:51:25 wkstn-A kernel: [ 73.429989] ? _nv030689rm+0x4b/0x110 [nvidia]
Jun 17 09:51:25 wkstn-A kernel: [ 73.430189] ? _nv001209rm+0x115/0x430 [nvidia]
Jun 17 09:51:25 wkstn-A kernel: [ 73.430388] ? _nv001060rm+0x220/0x3c0 [nvidia]
Jun 17 09:51:25 wkstn-A kernel: [ 73.430586] ? _nv001080rm+0x299/0x330 [nvidia]
Jun 17 09:51:25 wkstn-A kernel: [ 73.430785] ? rm_disable_adapter+0x6a/0x130 [nvidia]
Jun 17 09:51:25 wkstn-A kernel: [ 73.430896] ? nv_shutdown_adapter+0x1c/0xa0 [nvidia]
Jun 17 09:51:25 wkstn-A kernel: [ 73.431007] ? nv_close_device+0xd8/0x120 [nvidia]
Jun 17 09:51:25 wkstn-A kernel: [ 73.431118] ? nvidia_close+0xd7/0x370 [nvidia]
Jun 17 09:51:25 wkstn-A kernel: [ 73.431229] ? nvidia_frontend_close+0x2f/0x50 [nvidia]
Jun 17 09:51:25 wkstn-A kernel: [ 73.431232] ? __fput+0xea/0x220
Jun 17 09:51:25 wkstn-A kernel: [ 73.431234] ? ____fput+0xe/0x10
Jun 17 09:51:25 wkstn-A kernel: [ 73.431236] ? task_work_run+0x9d/0xc0
Jun 17 09:51:25 wkstn-A kernel: [ 73.431239] ? exit_to_usermode_loop+0xed/0xf0
Jun 17 09:51:25 wkstn-A kernel: [ 73.431241] ? do_syscall_64+0x107/0x120
Jun 17 09:51:25 wkstn-A kernel: [ 73.431245] ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
Jun 17 09:51:25 wkstn-A kernel: [ 73.431246] Modules linked in: pci_stub vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) nfsv3 nfs fscache vmnet(OE) vmw_vsock_vmci_transport vsock vmw_vmci vmmon(OE) aufs overlay binfmt_misc nvidia_uvm(POE) intel_rapl sb_edac snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc snd_hda_codec_realtek wmi_bmof snd_usb_audio snd_hda_codec_generic aesni_intel aes_x86_64 crypto_simd snd_hda_intel cryptd snd_usbmidi_lib snd_hda_codec glue_helper snd_seq_midi snd_seq_midi_event snd_rawmidi snd_hda_core snd_hwdep intel_cstate joydev input_leds snd_seq intel_rapl_perf serio_raw snd_pcm lpc_ich snd_seq_device snd_timer snd soundcore mei_me mei mac_hid wmi sch_fq_codel nfsd auth_rpcgss nfs_acl lockd grace sunrpc
Jun 17 09:51:25 wkstn-A kernel: [ 73.431278] parport_pc ppdev lp parport ip_tables x_tables autofs4 dm_mirror dm_region_hash dm_log nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops isci hid_generic drm uas ahci usbhid libsas ipmi_devintf psmouse hid e1000e usb_storage libahci scsi_transport_sas pata_acpi ipmi_msghandler
Jun 17 09:51:25 wkstn-A kernel: [ 73.431296] CR2: 0000000000000220
Jun 17 09:51:25 wkstn-A kernel: [ 73.431298] ---[ end trace f785f2aa20aae612 ]---
Jun 17 09:51:25 wkstn-A kernel: [ 74.597073] RIP: 0010:_nv011408rm+0x22/0x90 [nvidia]
Jun 17 09:51:25 wkstn-A kernel: [ 74.597078] Code: 0f 1f 84 00 00 00 00 00 41 56 49 89 ce 41 55 41 89 d5 41 54 49 89 f4 53 48 83 ec 08 48 85 c9 74 60 48 8b 96 30 1d 00 00 31 db <80> ba 20 02 00 00 00 74 2c 0f 1f 44 00 00 0f b6 f3 48 89 d7 ff 52
Jun 17 09:51:25 wkstn-A kernel: [ 74.597178] RSP: 0018:ffff9d81c59e7c18 EFLAGS: 00010246
Jun 17 09:51:25 wkstn-A kernel: [ 74.597179] RAX: ffff90e94e862e60 RBX: 0000000000000000 RCX: ffff90e94e862e4c
Jun 17 09:51:25 wkstn-A kernel: [ 74.597180] RDX: 0000000000000000 RSI: ffff90e955040008 RDI: ffff90e94e9dc008
Jun 17 09:51:25 wkstn-A kernel: [ 74.597181] RBP: ffff90e94e862e48 R08: 0000000000000000 R09: ffff90e94e862e60
Jun 17 09:51:25 wkstn-A kernel: [ 74.597182] R10: 0000000000000000 R11: ffffffffc06154d0 R12: ffff90e955040008
Jun 17 09:51:25 wkstn-A kernel: [ 74.597183] R13: 0000000000000007 R14: ffff90e94e862e4c R15: ffff90e94e862e60
Jun 17 09:51:25 wkstn-A kernel: [ 74.597185] FS: 00007ff49e583a80(0000) GS:ffff90e96fcc0000(0000) knlGS:0000000000000000
Jun 17 09:51:25 wkstn-A kernel: [ 74.597186] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 17 09:51:25 wkstn-A kernel: [ 74.597187] CR2: 0000000000000220 CR3: 0000000465bc0001 CR4: 00000000000606e0
Jun 17 09:51:25 wkstn-A kernel: [ 74.598165] general protection fault: 0000 [#2] SMP PTI
Jun 17 09:51:25 wkstn-A kernel: [ 74.598168] CPU: 17 PID: 3233 Comm: Xorg Tainted: P D OE 4.18.0-21-generic #22~18.04.1-Ubuntu
Jun 17 09:51:25 wkstn-A kernel: [ 74.598169] Hardware name: LENOVO 4223W1H/LENOVO, BIOS A1KT44AUS 12/18/2012
Jun 17 09:51:25 wkstn-A kernel: [ 74.598386] RIP: 0010:_nv007220rm+0x25/0x90 [nvidia]
Jun 17 09:51:25 wkstn-A kernel: [ 74.598386] Code: 00 00 00 00 00 31 c9 48 85 ff 53 48 89 fb 74 0d 48 85 d2 74 08 48 63 47 08 48 8d 0c 10 48 8b 03 31 d2 0f 1f 00 48 85 c0 74 11 <48> 39 30 48 89 c2 76 47 48 8b 40 10 48 85 c0 75 ef 48 85 d2 48 89
Jun 17 09:51:25 wkstn-A kernel: [ 74.598412] RSP: 0018:ffff9d81c59e7b98 EFLAGS: 00010086
Jun 17 09:51:25 wkstn-A kernel: [ 74.598413] RAX: df89480875c08400 RBX: ffffffffc1087798 RCX: ffff90e964a6afc8
Jun 17 09:51:25 wkstn-A kernel: [ 74.598414] RDX: ffffffffc225234f RSI: 0000000000000ca1 RDI: ffffffffc1087798
Jun 17 09:51:25 wkstn-A kernel: [ 74.598415] RBP: ffff90e964a6ae70 R08: ffffbd7db00c4a20 R09: ffff9d81c59e7cac
Jun 17 09:51:25 wkstn-A kernel: [ 74.598416] R10: 00000000c1d00000 R11: ffff90e94e917c08 R12: 0000000000000ca1
Jun 17 09:51:25 wkstn-A kernel: [ 74.598417] R13: 0000000000000000 R14: 0000000100000002 R15: ffff90e964a6af78
Jun 17 09:51:25 wkstn-A kernel: [ 74.598419] FS: 0000000000000000(0000) GS:ffff90e96fcc0000(0000) knlGS:0000000000000000
Jun 17 09:51:25 wkstn-A kernel: [ 74.598420] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 17 09:51:25 wkstn-A kernel: [ 74.598421] CR2: 0000000000000220 CR3: 0000000313a0a004 CR4: 00000000000606e0
Jun 17 09:51:25 wkstn-A kernel: [ 74.598421] Call Trace:
Jun 17 09:51:25 wkstn-A kernel: [ 74.598633] ? _nv025888rm+0x13/0x50 [nvidia]
Jun 17 09:51:25 wkstn-A kernel: [ 74.598838] ? _nv035590rm+0x144/0x1e0 [nvidia]
Jun 17 09:51:25 wkstn-A kernel: [ 74.598841] ? _cond_resched+0x19/0x40
Jun 17 09:51:25 wkstn-A kernel: [ 74.599046] ? _nv007322rm+0x50/0x100 [nvidia]
Jun 17 09:51:25 wkstn-A kernel: [ 74.599244] ? rm_kernel_rmapi_op+0x8d/0x150 [nvidia]
Jun 17 09:51:25 wkstn-A kernel: [ 74.599358] ? nvidia_modeset_rm_ops_alloc_stack+0x1e/0x50 [nvidia]
Jun 17 09:51:25 wkstn-A kernel: [ 74.599378] ? nvkms_call_rm+0x4f/0x80 [nvidia_modeset]
Jun 17 09:51:25 wkstn-A kernel: [ 74.599394] ? _nv002526kms+0x51/0x60 [nvidia_modeset]
Jun 17 09:51:25 wkstn-A kernel: [ 74.599404] ? _nv000367kms+0x55/0xb0 [nvidia_modeset]
Jun 17 09:51:25 wkstn-A kernel: [ 74.599415] ? _nv002253kms+0x1f/0x40 [nvidia_modeset]
Jun 17 09:51:25 wkstn-A kernel: [ 74.599428] ? _nv002512kms+0x85/0xe0 [nvidia_modeset]
Jun 17 09:51:25 wkstn-A kernel: [ 74.599442] ? _nv002350kms+0x61/0x3a0 [nvidia_modeset]
Jun 17 09:51:25 wkstn-A kernel: [ 74.599455] ? _nv002274kms+0x6a/0xd0 [nvidia_modeset]
Jun 17 09:51:25 wkstn-A kernel: [ 74.599464] ? _nv000342kms+0xbf/0x100 [nvidia_modeset]
Jun 17 09:51:25 wkstn-A kernel: [ 74.599474] ? nvKmsClose+0xab/0x170 [nvidia_modeset]
Jun 17 09:51:25 wkstn-A kernel: [ 74.599484] ? nvkms_close_common+0x21/0x60 [nvidia_modeset]
Jun 17 09:51:25 wkstn-A kernel: [ 74.599493] ? nvkms_close+0x1a/0x30 [nvidia_modeset]
Jun 17 09:51:25 wkstn-A kernel: [ 74.599604] ? nvidia_frontend_close+0x2f/0x50 [nvidia]
Jun 17 09:51:25 wkstn-A kernel: [ 74.599606] ? __fput+0xea/0x220
Jun 17 09:51:25 wkstn-A kernel: [ 74.599607] ? ____fput+0xe/0x10
Jun 17 09:51:25 wkstn-A kernel: [ 74.599609] ? task_work_run+0x9d/0xc0
Jun 17 09:51:25 wkstn-A kernel: [ 74.599612] ? do_exit+0x2eb/0xb30
Jun 17 09:51:25 wkstn-A kernel: [ 74.599614] ? exit_to_usermode_loop+0xed/0xf0
Jun 17 09:51:25 wkstn-A kernel: [ 74.599617] ? rewind_stack_do_exit+0x17/0x20
Jun 17 09:51:25 wkstn-A kernel: [ 74.599618] Modules linked in: pci_stub vboxpci(OE) vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) nfsv3 nfs fscache vmnet(OE) vmw_vsock_vmci_transport vsock vmw_vmci vmmon(OE) aufs overlay binfmt_misc nvidia_uvm(POE) intel_rapl sb_edac snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc snd_hda_codec_realtek wmi_bmof snd_usb_audio snd_hda_codec_generic aesni_intel aes_x86_64 crypto_simd snd_hda_intel cryptd snd_usbmidi_lib snd_hda_codec glue_helper snd_seq_midi snd_seq_midi_event snd_rawmidi snd_hda_core snd_hwdep intel_cstate joydev input_leds snd_seq intel_rapl_perf serio_raw snd_pcm lpc_ich snd_seq_device snd_timer snd soundcore mei_me mei mac_hid wmi sch_fq_codel nfsd auth_rpcgss nfs_acl lockd grace sunrpc
Jun 17 09:51:25 wkstn-A kernel: [ 74.599649] parport_pc ppdev lp parport ip_tables x_tables autofs4 dm_mirror dm_region_hash dm_log nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops isci hid_generic drm uas ahci usbhid libsas ipmi_devintf psmouse hid e1000e usb_storage libahci scsi_transport_sas pata_acpi ipmi_msghandler
Jun 17 09:51:25 wkstn-A kernel: [ 74.599665] ---[ end trace f785f2aa20aae613 ]---
Jun 17 09:51:25 wkstn-A kernel: [ 74.603137] RIP: 0010:_nv011408rm+0x22/0x90 [nvidia]
Jun 17 09:51:25 wkstn-A kernel: [ 74.603138] Code: 0f 1f 84 00 00 00 00 00 41 56 49 89 ce 41 55 41 89 d5 41 54 49 89 f4 53 48 83 ec 08 48 85 c9 74 60 48 8b 96 30 1d 00 00 31 db <80> ba 20 02 00 00 00 74 2c 0f 1f 44 00 00 0f b6 f3 48 89 d7 ff 52
Jun 17 09:51:25 wkstn-A kernel: [ 74.603163] RSP: 0018:ffff9d81c59e7c18 EFLAGS: 00010246
Jun 17 09:51:25 wkstn-A kernel: [ 74.603165] RAX: ffff90e94e862e60 RBX: 0000000000000000 RCX: ffff90e94e862e4c
Jun 17 09:51:25 wkstn-A kernel: [ 74.603166] RDX: 0000000000000000 RSI: ffff90e955040008 RDI: ffff90e94e9dc008
Jun 17 09:51:25 wkstn-A kernel: [ 74.603167] RBP: ffff90e94e862e48 R08: 0000000000000000 R09: ffff90e94e862e60
Jun 17 09:51:25 wkstn-A kernel: [ 74.603168] R10: 0000000000000000 R11: ffffffffc06154d0 R12: ffff90e955040008
Jun 17 09:51:25 wkstn-A kernel: [ 74.603169] R13: 0000000000000007 R14: ffff90e94e862e4c R15: ffff90e94e862e60
Jun 17 09:51:25 wkstn-A kernel: [ 74.603170] FS: 0000000000000000(0000) GS:ffff90e96fcc0000(0000) knlGS:0000000000000000
Jun 17 09:51:25 wkstn-A kernel: [ 74.603171] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 17 09:51:25 wkstn-A kernel: [ 74.603172] CR2: 0000000000000220 CR3: 0000000313a0a004 CR4: 00000000000606e0
Jun 17 09:51:25 wkstn-A kernel: [ 74.603173] Fixing recursive fault but reboot is needed!
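
As a sanity check on the first oops, the sketch below decodes the instruction marked with <80> in the Code: line and compares its memory operand with CR2. This is only my reading of the raw bytes; I have no symbols or source for the nvidia module. The decoder is hand-written and handles just this one encoding (opcode 0x80 with a register base plus 32-bit displacement); it is not a general x86 disassembler. The register value RDX = 0 is taken from the dump above.

```python
# "Code:" bytes copied verbatim from the first oops; "<80>" marks the faulting byte.
CODE = (
    "0f 1f 84 00 00 00 00 00 41 56 49 89 ce 41 55 41 89 d5 41 54 49 89 f4 53 "
    "48 83 ec 08 48 85 c9 74 60 48 8b 96 30 1d 00 00 31 db <80> ba 20 02 00 "
    "00 00 74 2c 0f 1f 44 00 00 0f b6 f3 48 89 d7 ff 52"
)

REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]
GROUP1 = ["add", "or", "adc", "sbb", "and", "sub", "xor", "cmp"]  # opcode 0x80 /0../7


def faulting_bytes(code, length=7):
    """Return `length` bytes starting at the byte wrapped in <...>."""
    tokens = code.split()
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    return bytes(int(t.strip("<>"), 16) for t in tokens[start:start + length])


def decode_group1_imm8(insn):
    """Decode 'op byte [reg + disp32], imm8' (opcode 0x80, mod=2, no SIB byte)."""
    opcode, modrm = insn[0], insn[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x80 or mod != 2 or rm == 4:
        raise ValueError("encoding not handled by this sketch")
    disp = int.from_bytes(insn[2:6], "little", signed=True)
    return GROUP1[reg], REG64[rm], disp, insn[6]


op, base, disp, imm = decode_group1_imm8(faulting_bytes(CODE))
print(f"{op} byte ptr [{base}+{disp:#x}], {imm:#x}")

# Register and CR2 values as printed in the first oops.
RDX = 0x0000000000000000
CR2 = 0x0000000000000220
fault_address = RDX + disp
print(f"computed address {fault_address:#x}, CR2 {CR2:#x}")
```

The bytes decode to a one-byte compare against memory at RDX+0x220, and with RDX = 0 that is exactly the address in CR2. The instruction just before it (48 8b 96 30 1d 00 00) appears to load RDX from offset 0x1d30 of the structure RSI points to, so my guess is that a pointer field in a driver structure was still NULL when Xorg closed the device.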
Attached is most of the output of nvidia-bug-report.sh.
nvidia-bug-report-1055774.txt.gz (94.7 KB)