nvidia-settings gives errors: 3090 Ti eGPU, Dell laptop, Ubuntu

Hi,

This is a clean updated install of Ubuntu 20.04.4 LTS.
I only want to use the eGPU for tensor calculations; I don’t want to render with it.

Ubuntu detects the GPU on the PCI bus:
lspci | grep -i nvidia

01:00.0 VGA compatible controller: NVIDIA Corporation TU106GLM [Quadro RTX 3000 Mobile / Max-Q] (rev a1)
01:00.1 Audio device: NVIDIA Corporation TU106 High Definition Audio Controller (rev a1)
01:00.2 USB controller: NVIDIA Corporation TU106 USB 3.1 Host Controller (rev a1)
01:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU106 USB Type-C UCSI Controller (rev a1)
3d:00.0 VGA compatible controller: NVIDIA Corporation Device 2203 (rev a1)
3d:00.1 Audio device: NVIDIA Corporation Device 1aef (rev a1)

nvidia-bug-report.log.gz (729.7 KB)

However, nvidia-smi gives:

Wed Aug 10 09:29:24 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro RTX 3000     Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   45C    P0    29W /  N/A |     10MiB /  6144MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  Off  | 00000000:3D:00.0 Off |                  Off |
| 30%   35C    P0    85W / 450W |      0MiB / 24564MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1375      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A      1848      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+
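For the stated goal of using the eGPU only for tensor calculations, one common approach is to restrict CUDA to that device with the `CUDA_VISIBLE_DEVICES` environment variable. The sketch below is an illustration, not something posted in this thread: `gpu_index_for_bus` is a hypothetical helper that looks up a GPU's index by PCI bus ID in nvidia-smi's CSV output, and `train.py` is a placeholder script name.

```shell
# Hypothetical helper: print the nvidia-smi index of the GPU at a given PCI bus ID.
# Reads the output of
#   nvidia-smi --query-gpu=index,pci.bus_id,name --format=csv,noheader
# on stdin; $1 is the bus ID as nvidia-smi prints it, e.g. 00000000:3D:00.0.
gpu_index_for_bus() {
    # Fields are separated by ", "; compare bus IDs case-insensitively.
    awk -F', ' -v bus="$1" 'toupper($2) == toupper(bus) { print $1 }'
}

# Example use on the machine itself (train.py is a placeholder):
#   idx=$(nvidia-smi --query-gpu=index,pci.bus_id,name --format=csv,noheader \
#           | gpu_index_for_bus 00000000:3D:00.0)
#   CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES="$idx" python3 train.py
```

Setting `CUDA_DEVICE_ORDER=PCI_BUS_ID` makes CUDA number the devices in the same order as nvidia-smi; without it, CUDA defaults to ordering the fastest device first, so the indices may not match.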

nvidia-settings gives the following output:

(nvidia-settings:6605): GLib-GObject-CRITICAL **: 09:29:58.112: g_object_unref: assertion 'G_IS_OBJECT (object)' failed

** (nvidia-settings:6605): CRITICAL **: 09:29:58.113: ctk_powermode_new: assertion '(ctrl_target != NULL) && (ctrl_target->h != NULL)' failed
** Message: 09:29:58.134: PRIME: Requires offloading
** Message: 09:29:58.134: PRIME: is it supported? yes
** Message: 09:29:58.154: PRIME: Usage: /usr/bin/prime-select nvidia|intel|on-demand|query
** Message: 09:29:58.154: PRIME: on-demand mode: "1"
** Message: 09:29:58.154: PRIME: is "on-demand" mode supported? yes

How do I correct this?

Thanks!

That error is always displayed; you can ignore it.

Thanks for the reply, Generix.
However, the eGPU has now stopped working. I guess this is related to other issues, but I would very much appreciate some advice. It is detected by lspci but does not appear in nvidia-smi.

dmesg | grep -i nvidia
[ 1.053338] nvidia-gpu 0000:01:00.3: enabling device (0000 -> 0002)
[ 5.993961] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input28
[ 6.129195] nvidia: loading out-of-tree module taints kernel.
[ 6.129206] nvidia: module license 'NVIDIA' taints kernel.
[ 6.140186] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[ 6.215343] nvidia-nvlink: Nvlink Core is being initialized, major device number 507
[ 6.216644] nvidia 0000:01:00.0: enabling device (0000 -> 0003)
[ 6.217603] nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[ 6.264488] audit: type=1400 audit(1660512409.043:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=727 comm="apparmor_parser"
[ 6.264492] audit: type=1400 audit(1660512409.043:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=727 comm="apparmor_parser"
[ 6.266731] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 515.65.01 Wed Jul 20 14:00:58 UTC 2022
[ 6.284536] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 515.65.01 Wed Jul 20 13:43:59 UTC 2022
[ 6.360337] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input29
[ 6.360383] input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input30
[ 6.360455] input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input31
[ 6.360500] input: HDA NVidia HDMI/DP,pcm=10 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input32
[ 6.603739] [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
[ 6.694788] nvidia-gpu 0000:01:00.3: i2c timeout error e0000000
[ 9.141670] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 1
[ 9.165739] nvidia_uvm: module uses symbols from proprietary module nvidia, inheriting taint.
[ 9.169450] nvidia-uvm: Loaded the UVM driver, major device number 505.
[ 47.476217] nvidia 0000:3d:00.0: enabling device (0000 -> 0003)
[ 47.476414] nvidia 0000:3d:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[ 47.970007] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:1c.0/0000:04:00.0/0000:05:04.0/0000:3b:00.0/0000:3c:01.0/0000:3d:00.1/sound/card2/input45
[ 47.970092] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:1c.0/0000:04:00.0/0000:05:04.0/0000:3b:00.0/0000:3c:01.0/0000:3d:00.1/sound/card2/input46
[ 47.970180] input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:1c.0/0000:04:00.0/0000:05:04.0/0000:3b:00.0/0000:3c:01.0/0000:3d:00.1/sound/card2/input47
[ 47.970272] input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:1c.0/0000:04:00.0/0000:05:04.0/0000:3b:00.0/0000:3c:01.0/0000:3d:00.1/sound/card2/input48
[ 47.970388] input: HDA NVidia HDMI/DP,pcm=10 as /devices/pci0000:00/0000:00:1c.0/0000:04:00.0/0000:05:04.0/0000:3b:00.0/0000:3c:01.0/0000:3d:00.1/sound/card2/input49
[ 47.970440] input: HDA NVidia HDMI/DP,pcm=11 as /devices/pci0000:00/0000:00:1c.0/0000:04:00.0/0000:05:04.0/0000:3b:00.0/0000:3c:01.0/0000:3d:00.1/sound/card2/input50
[ 47.970496] input: HDA NVidia HDMI/DP,pcm=12 as /devices/pci0000:00/0000:00:1c.0/0000:04:00.0/0000:05:04.0/0000:3b:00.0/0000:3c:01.0/0000:3d:00.1/sound/card2/input51

lspci | grep -i nvidia
01:00.0 VGA compatible controller: NVIDIA Corporation TU106GLM [Quadro RTX 3000 Mobile / Max-Q] (rev a1)
01:00.1 Audio device: NVIDIA Corporation TU106 High Definition Audio Controller (rev a1)
01:00.2 USB controller: NVIDIA Corporation TU106 USB 3.1 Host Controller (rev a1)
01:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU106 USB Type-C UCSI Controller (rev a1)
3d:00.0 VGA compatible controller: NVIDIA Corporation Device 2203 (rev a1)
3d:00.1 Audio device: NVIDIA Corporation Device 1aef (rev a1)

nvidia-smi
Mon Aug 15 07:36:07 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro RTX 3000     Off  | 00000000:01:00.0  On |                  N/A |
| N/A   45C    P8    12W /  N/A |    119MiB /  6144MiB |     17%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1737      G   /usr/lib/xorg/Xorg                 22MiB |
|    0   N/A  N/A      2199      G   /usr/lib/xorg/Xorg                 95MiB |
+-----------------------------------------------------------------------------+

Thanks!
nvidia-bug-report.log.gz (605.3 KB)

NVRM: GPU 0000:3d:00.0: RmInitAdapter failed! (0x26:0x56:1440)
https://forums.developer.nvidia.com/t/device-boots-into-headless-mode/221380/2?u=generix
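A filter such as `dmesg | grep -i nvidia` would not show a line like the one quoted above, because the RmInitAdapter message is prefixed with `NVRM:` and does not contain the string "nvidia". A minimal sketch for pulling such lines out of the kernel log follows; `nvrm_messages_for` is a hypothetical helper name, and the PCI address is the eGPU's address from the quoted message.

```shell
# Hypothetical helper: print NVRM kernel messages that mention a given PCI address.
# Reads kernel log text on stdin; $1 is the address, e.g. 0000:3d:00.0.
nvrm_messages_for() {
    # -F treats both patterns as fixed strings, so the dots and colons
    # in the address are matched literally.
    grep -F 'NVRM:' | grep -F "$1"
}

# Example use:
#   sudo dmesg | nvrm_messages_for 0000:3d:00.0
```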

Thanks Generix,
I uninstalled 515 with Software & Updates and switched to the open-source driver. I restarted, rolled back to 450 with Software & Updates, set up PRIME in on-demand mode, and restarted again.

I’m still not seeing the eGPU in nvidia-smi.

I’m not sure whether the key is to use the “run” file, but I wasn’t sure which one to use.

Thanks

OK, I downloaded the run file, and tried to follow the readme as closely as possible.
Note that runlevel 3 in Ubuntu 20.04 is not accessed as described, and the only way I seemed to be able to get into runlevel 3 was via sudo init 3.
Modifying /etc/default/grub did not do the trick.
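For context, Ubuntu 20.04 uses systemd, where the old runlevels map onto targets and runlevel 3 corresponds to `multi-user.target`. The sketch below shows that mapping and how it might be used to leave the graphical session before running the installer; `runlevel_to_target` is a hypothetical helper name, not part of any tool mentioned here.

```shell
# Hypothetical helper: map a SysV runlevel number to its systemd target.
runlevel_to_target() {
    case "$1" in
        0) echo poweroff.target ;;
        1) echo rescue.target ;;
        2|3|4) echo multi-user.target ;;
        5) echo graphical.target ;;
        6) echo reboot.target ;;
        *) echo "unknown runlevel: $1" >&2; return 1 ;;
    esac
}

# Switch the running system to text mode (this stops the display manager):
#   sudo systemctl isolate "$(runlevel_to_target 3)"
# Return to the desktop afterwards:
#   sudo systemctl isolate "$(runlevel_to_target 5)"
```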

I made a blacklist file for nouveau.
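The contents of that file are not shown in the thread. A typical nouveau blacklist, as given in NVIDIA's installation instructions for Ubuntu, has two lines. The sketch below writes them to a path passed as an argument; `write_nouveau_blacklist` is a hypothetical helper name, and the `/tmp` path in the usage comments is only an example.

```shell
# Hypothetical helper: write a two-line nouveau blacklist to the file named in $1.
write_nouveau_blacklist() {
    printf '%s\n' \
        'blacklist nouveau' \
        'options nouveau modeset=0' > "$1"
}

# Example use (the file normally lives in /etc/modprobe.d/):
#   write_nouveau_blacklist /tmp/blacklist-nouveau.conf
#   sudo install -m 644 /tmp/blacklist-nouveau.conf /etc/modprobe.d/blacklist-nouveau.conf
#   sudo update-initramfs -u   # rebuild the initrd so the change applies at boot
```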

I installed the 470.82 run file, and followed through the process.

I’m still not getting the eGPU (3090 Ti) listed with nvidia-smi.

I’ve attached the new bug file…
nvidia-bug-report.log.gz (334.1 KB)

Thanks for your help.

The 450 driver is still being loaded, and it doesn’t support the 3090 Ti. Please run
sudo update-initramfs -u
to remove it from the initrd.
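One way to check whether NVIDIA modules are still packed into the initrd after regenerating it is to inspect the initrd's file listing. This is a sketch that assumes the `lsinitramfs` tool from Ubuntu's initramfs-tools is available; `nvidia_modules_in_listing` is a hypothetical helper name.

```shell
# Hypothetical helper: keep only NVIDIA kernel-module paths from a file listing on stdin.
nvidia_modules_in_listing() {
    # Matches file names such as nvidia.ko, nvidia-drm.ko or nvidia_uvm.ko
    # (including compressed variants like nvidia.ko.zst).
    grep -E '(^|/)nvidia[^/]*\.ko'
}

# Example use:
#   lsinitramfs "/boot/initrd.img-$(uname -r)" | nvidia_modules_in_listing
# Version of the nvidia module that would be loaded from disk:
#   modinfo -F version nvidia
# Version of the module that is currently loaded:
#   cat /proc/driver/nvidia/version
```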

Much appreciated. Thanks for your help.
