2 cards issue

Hi,

I have a machine with 2 cards: GT 710 and Tesla T4.
I installed the 415 drivers and the display works fine, but nvidia-smi only shows the GT 710, not the Tesla. This is on Ubuntu 18.04.

On CentOS 7.5 everything works: nvidia-smi shows both cards without any problems.

How can I fix it?

nvidia-bug-report.log.gz (1.02 MB)

Please run nvidia-bug-report.sh as root and attach the resulting .gz file to your post. Hovering the mouse over an existing post of yours will reveal a paperclip icon.
https://devtalk.nvidia.com/default/topic/1043347/announcements/attaching-files-to-forum-topics-posts/

Uploaded.

Looks like a kernel problem: it fails to assign the PCI memory windows (BARs):

[    0.200725] pci 0000:16:00.0: resource 14 [mem 0x93000000-0x94ffffff] released
[    0.200726] pci 0000:16:00.0: PCI bridge to [bus 17]
[    0.200730] pci 0000:16:00.0: bridge window [mem 0x08000000-0x13fffffff 64bit pref] to [bus 17] add_size 128000000 add_align 10000000
[    0.200732] pci 0000:16:00.0: bridge window [mem 0x01000000-0x02ffffff] to [bus 17] add_size 1000000 add_align 1000000
[    0.200734] pci 0000:16:00.0: BAR 15: no space for [mem size 0x260000000 64bit pref]
[    0.200735] pci 0000:16:00.0: BAR 15: failed to assign [mem size 0x260000000 64bit pref]
[    0.200736] pci 0000:16:00.0: BAR 14: assigned [mem 0x93000000-0x95ffffff]
[    0.200737] pci 0000:16:00.0: BAR 15: no space for [mem size 0x138000000 64bit pref]
[    0.200738] pci 0000:16:00.0: BAR 15: failed to assign [mem size 0x138000000 64bit pref]
[    0.200739] pci 0000:16:00.0: BAR 14: assigned [mem 0x93000000-0x94ffffff]
[    0.200740] pci 0000:16:00.0: BAR 14: reassigned [mem 0x93000000-0x95ffffff] (expanded by 0x1000000)
[    0.200742] pci 0000:17:00.0: BAR 1: no space for [mem size 0x10000000 64bit pref]
[    0.200743] pci 0000:17:00.0: BAR 1: failed to assign [mem size 0x10000000 64bit pref]
[    0.200744] pci 0000:17:00.0: BAR 8: no space for [mem size 0x100000000 64bit pref]
[    0.200745] pci 0000:17:00.0: BAR 8: failed to assign [mem size 0x100000000 64bit pref]
[    0.200746] pci 0000:17:00.0: BAR 3: assigned [mem 0x94000000-0x95ffffff 64bit pref]
[    0.200751] pci 0000:17:00.0: BAR 10: no space for [mem size 0x20000000 64bit pref]
[    0.200752] pci 0000:17:00.0: BAR 10: failed to assign [mem size 0x20000000 64bit pref]
[    0.200753] pci 0000:17:00.0: BAR 0: assigned [mem 0x93000000-0x93ffffff]
[    0.200755] pci 0000:17:00.0: BAR 7: no space for [mem size 0x00400000]
[    0.200756] pci 0000:17:00.0: BAR 7: failed to assign [mem size 0x00400000]
[    0.200757] pci 0000:17:00.0: BAR 1: no space for [mem size 0x10000000 64bit pref]
[    0.200758] pci 0000:17:00.0: BAR 1: failed to assign [mem size 0x10000000 64bit pref]
[    0.200759] pci 0000:17:00.0: BAR 3: assigned [mem 0x94000000-0x95ffffff 64bit pref]
[    0.200764] pci 0000:17:00.0: BAR 0: assigned [mem 0x93000000-0x93ffffff]
[    0.200766] pci 0000:17:00.0: BAR 7: no space for [mem size 0x00400000]
[    0.200767] pci 0000:17:00.0: BAR 7: failed to assign [mem size 0x00400000]
[    0.200768] pci 0000:17:00.0: BAR 10: no space for [mem size 0x20000000 64bit pref]
[    0.200769] pci 0000:17:00.0: BAR 10: failed to assign [mem size 0x20000000 64bit pref]
[    0.200770] pci 0000:17:00.0: BAR 8: no space for [mem size 0x100000000 64bit pref]
[    0.200771] pci 0000:17:00.0: BAR 8: failed to assign [mem size 0x100000000 64bit pref]
[    0.200771] pci 0000:16:00.0: PCI bridge to [bus 17]
[    0.200773] pci 0000:16:00.0:   bridge window [mem 0x93000000-0x95ffffff]
[    0.200777] pci_bus 0000:16: Automatically enabled pci realloc, if you have problem, try booting with pci=realloc=off

This should be reported to Ubuntu. Meanwhile, you can try a different kernel from the kernel PPA.
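
If you want to see at a glance which BARs the kernel gave up on, something like the following rough Python sketch could help. It only assumes the "failed to assign" line format seen in the log above; the function name failed_bars and the shortened sample text are made up for illustration, and on a real machine you would feed it the output of dmesg instead.

```python
import re
from collections import defaultdict

# Matches kernel lines such as:
# [    0.200735] pci 0000:16:00.0: BAR 15: failed to assign [mem size 0x260000000 64bit pref]
FAILED_BAR = re.compile(
    r"pci (?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"BAR (?P<bar>\d+): failed to assign \[mem size (?P<size>0x[0-9a-f]+)"
)


def failed_bars(log_text):
    """Return {device: {bar_number: size_in_bytes}} for BARs that were not placed."""
    failures = defaultdict(dict)
    for match in FAILED_BAR.finditer(log_text):
        # The kernel retries with different sizes; the last reported size wins.
        failures[match["dev"]][int(match["bar"])] = int(match["size"], 16)
    return {dev: dict(bars) for dev, bars in failures.items()}


# Shortened sample taken from the log in this thread (hypothetical variable name).
SAMPLE_LOG = """\
[    0.200735] pci 0000:16:00.0: BAR 15: failed to assign [mem size 0x260000000 64bit pref]
[    0.200738] pci 0000:16:00.0: BAR 15: failed to assign [mem size 0x138000000 64bit pref]
[    0.200743] pci 0000:17:00.0: BAR 1: failed to assign [mem size 0x10000000 64bit pref]
[    0.200753] pci 0000:17:00.0: BAR 0: assigned [mem 0x93000000-0x93ffffff]
[    0.200756] pci 0000:17:00.0: BAR 7: failed to assign [mem size 0x00400000]
"""

for dev, bars in failed_bars(SAMPLE_LOG).items():
    for bar, size in sorted(bars.items()):
        print(f"{dev} BAR {bar}: {size // 2**20} MiB not assigned")
```

Lines that report a successful assignment are skipped, so the output lists only the windows the kernel could not place.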

Found the solution; it’s in the last line of the log:

In /etc/default/grub, add the parameter pci=realloc=off to GRUB_CMDLINE_LINUX_DEFAULT.

Save the file, run update-grub as root to recreate the boot menu, then reboot so the new kernel parameter takes effect.

Now nvidia-smi shows both cards ;)
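
For anyone who would rather script the edit than do it by hand, here is a minimal Python sketch of the change to the GRUB_CMDLINE_LINUX_DEFAULT line. It assumes the value is wrapped in double quotes; the function name add_kernel_param and the sample file contents are hypothetical. It only transforms text, so writing the result back to /etc/default/grub and running update-grub still has to be done separately as root.

```python
import re


def add_kernel_param(grub_text, param="pci=realloc=off"):
    """Append `param` to GRUB_CMDLINE_LINUX_DEFAULT unless it is already present."""
    # Assumes the value is double-quoted, e.g. GRUB_CMDLINE_LINUX_DEFAULT="..."
    pattern = re.compile(r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")', re.MULTILINE)

    def patch(match):
        params = match.group(2).split()
        if param not in params:
            params.append(param)
        return match.group(1) + " ".join(params) + match.group(3)

    new_text, count = pattern.subn(patch, grub_text)
    if count == 0:
        raise ValueError("GRUB_CMDLINE_LINUX_DEFAULT not found")
    return new_text


# Hypothetical sample contents, not the actual file from the machine in this thread.
SAMPLE_GRUB = (
    "GRUB_DEFAULT=0\n"
    'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n'
    'GRUB_CMDLINE_LINUX=""\n'
)

print(add_kernel_param(SAMPLE_GRUB))
```

Running the function twice leaves the line unchanged the second time, so it is safe to re-run.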
