Hi,
I’ve been struggling to get K40 cards detected under Ubuntu 21.10 on an HP DL580 G7.
I’ve tried reseating the cards (I have four donated K40 cards in total), inserting one card at a time, and double-checking the power connections (I tried 1x 8-pin on its own, and 1x 8-pin + 1x 6-pin together).
I’m sharing troubleshooting information below; two K40 cards are currently installed in the machine.
Would appreciate any pointers, thanks in advance.
nvidia-smi reports "No devices were found". See below:
$ sudo nvidia-smi
No devices were found
I ran nvidia-bug-report.sh and the results are attached as nvidia-bug-report.log.gz.
I was poking through the output of dmesg and am highlighting some sections:
[ 44.495239] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 470.103.01 Thu Jan 6 12:10:04 UTC 2022
[ 44.518197] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 470.103.01 Thu Jan 6 12:12:52 UTC 2022
[ 44.762777] [drm] ib test succeeded in 0 usecs
[ 44.763943] [drm] No TV DAC info found in BIOS
[ 44.764026] [drm] Radeon Display Connectors
[ 44.764034] [drm] Connector 0:
[ 44.764039] [drm] VGA-1
[ 44.764045] [drm] DDC: 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60
[ 44.764054] [drm] Encoders:
[ 44.764059] [drm] CRT1: INTERNAL_DAC1
[ 44.764066] [drm] Connector 1:
[ 44.764071] [drm] VGA-2
[ 44.764075] [drm] DDC: 0x6c 0x6c 0x6c 0x6c 0x6c 0x6c 0x6c 0x6c
[ 44.764119] [drm] Encoders:
[ 44.764124] [drm] CRT2: INTERNAL_DAC2
[ 44.809273] [drm] fb mappable at 0xA8040000
[ 44.809283] [drm] vram apper at 0xA8000000
[ 44.809289] [drm] size 1572864
[ 44.809294] [drm] fb depth is 16
[ 44.809300] [drm] pitch is 2048
[ 44.994647] [drm] Initialized radeon 2.50.0 20080528 for 0000:01:03.0 on minor 0
[ 44.995428] [drm] [nvidia-drm] [GPU ID 0x00001100] Loading driver
[ 44.995625] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:11:00.0 on minor 1
[ 44.996569] [drm] [nvidia-drm] [GPU ID 0x00000b00] Loading driver
[ 44.996759] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:0b:00.0 on minor 2
[ 1525.198747] resource sanity check: requesting [mem 0x91700000-0x926fffff], which spans more than PCI Bus 0000:0b [mem 0x91000000-0x91ffffff]
[ 1525.198762] caller os_map_kernel_space.part.0+0x97/0xa0 [nvidia] mapping multiple BARs
[ 1525.253780] NVRM: GPU 0000:0b:00.0: RmInitAdapter failed! (0x24:0xffff:1211)
[ 1525.253893] NVRM: GPU 0000:0b:00.0: rm_init_adapter failed, device minor number 1
[ 1526.050048] resource sanity check: requesting [mem 0x91700000-0x926fffff], which spans more than PCI Bus 0000:0b [mem 0x91000000-0x91ffffff]
[ 1526.050057] caller os_map_kernel_space.part.0+0x97/0xa0 [nvidia] mapping multiple BARs
[ 1526.104605] NVRM: GPU 0000:0b:00.0: RmInitAdapter failed! (0x24:0xffff:1211)
[ 1526.104688] NVRM: GPU 0000:0b:00.0: rm_init_adapter failed, device minor number 1
[ 1527.344396] resource sanity check: requesting [mem 0x90700000-0x916fffff], which spans more than PCI Bus 0000:11 [mem 0x90000000-0x90ffffff]
[ 1527.344406] caller os_map_kernel_space.part.0+0x97/0xa0 [nvidia] mapping multiple BARs
[ 1527.398940] NVRM: GPU 0000:11:00.0: RmInitAdapter failed! (0x24:0xffff:1211)
[ 1527.399023] NVRM: GPU 0000:11:00.0: rm_init_adapter failed, device minor number 0
[ 1528.192348] resource sanity check: requesting [mem 0x90700000-0x916fffff], which spans more than PCI Bus 0000:11 [mem 0x90000000-0x90ffffff]
[ 1528.192358] caller os_map_kernel_space.part.0+0x97/0xa0 [nvidia] mapping multiple BARs
[ 1528.247464] NVRM: GPU 0000:11:00.0: RmInitAdapter failed! (0x24:0xffff:1211)
[ 1528.247551] NVRM: GPU 0000:11:00.0: rm_init_adapter failed, device minor number 0
[ 2259.710444] resource sanity check: requesting [mem 0x91700000-0x926fffff], which spans more than PCI Bus 0000:0b [mem 0x91000000-0x91ffffff]
[ 2259.710459] caller os_map_kernel_space.part.0+0x97/0xa0 [nvidia] mapping multiple BARs
[ 2259.766210] NVRM: GPU 0000:0b:00.0: RmInitAdapter failed! (0x24:0xffff:1211)
[ 2259.766323] NVRM: GPU 0000:0b:00.0: rm_init_adapter failed, device minor number 1
[ 2260.570910] resource sanity check: requesting [mem 0x91700000-0x926fffff], which spans more than PCI Bus 0000:0b [mem 0x91000000-0x91ffffff]
[ 2260.570920] caller os_map_kernel_space.part.0+0x97/0xa0 [nvidia] mapping multiple BARs
[ 2260.626208] NVRM: GPU 0000:0b:00.0: RmInitAdapter failed! (0x24:0xffff:1211)
[ 2260.626292] NVRM: GPU 0000:0b:00.0: rm_init_adapter failed, device minor number 1
[ 2261.422806] resource sanity check: requesting [mem 0x90700000-0x916fffff], which spans more than PCI Bus 0000:11 [mem 0x90000000-0x90ffffff]
[ 2261.422815] caller os_map_kernel_space.part.0+0x97/0xa0 [nvidia] mapping multiple BARs
[ 2261.478261] NVRM: GPU 0000:11:00.0: RmInitAdapter failed! (0x24:0xffff:1211)
[ 2261.478343] NVRM: GPU 0000:11:00.0: rm_init_adapter failed, device minor number 0
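To make the "resource sanity check" lines above easier to read, here is a small Python sketch that parses one of them and does the address arithmetic. The regex and the names `SANITY_RE` and `check_line` are my own (hypothetical, not part of any NVIDIA or kernel tool), and the script only compares the logged ranges; it does not by itself tell me why the driver requests that range.

```python
import re

# Matches the kernel "resource sanity check" message as it appears in the dmesg output above.
SANITY_RE = re.compile(
    r"requesting \[mem (0x[0-9a-f]+)-(0x[0-9a-f]+)\], "
    r"which spans more than PCI Bus (\S+) \[mem (0x[0-9a-f]+)-(0x[0-9a-f]+)\]"
)

def check_line(line):
    """Return sizes and overlap info for one sanity-check line, or None if it does not match."""
    m = SANITY_RE.search(line)
    if m is None:
        return None
    req_start, req_end = int(m.group(1), 16), int(m.group(2), 16)
    win_start, win_end = int(m.group(4), 16), int(m.group(5), 16)
    return {
        "bus": m.group(3),
        "request_size": req_end - req_start + 1,   # bytes the driver asked to map
        "window_size": win_end - win_start + 1,    # bytes in the bus memory window
        "fits": win_start <= req_start and req_end <= win_end,
        "overrun": max(0, req_end - win_end),      # bytes past the end of the window
    }

# One line copied from the dmesg output above.
line = ("[ 1525.198747] resource sanity check: requesting [mem 0x91700000-0x926fffff], "
        "which spans more than PCI Bus 0000:0b [mem 0x91000000-0x91ffffff]")
info = check_line(line)
print(info)
```

For this line the request is 16 MiB and the window on bus 0000:0b is also 16 MiB, but the request starts 7 MiB into the window and so runs 7 MiB past its end. The lines for bus 0000:11 show the same 7 MiB overrun.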
nvidia-bug-report.log.gz (87.3 KB)
The output of lsb_release is below:
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 21.10
Release: 21.10
Codename: impish
The output of uname is below:
$ uname -a
Linux nlp002 5.13.0-39-generic #44-Ubuntu SMP Thu Mar 24 15:35:05 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux