Ok, I understand, thx.
Is it a timing problem?
The entry /dev/dri/card2 does not exist.
modprobe -r nvidia says
modprobe: FATAL: Module nvidia is in use.
I think the same problem occurs when my script is called.
First I remove it with
modprobe -r nvidia
sleep 1
remove pci and scan…
generix
December 26, 2021, 9:13pm
22
Please check if nvidia-persistenced is enabled with systemd and disable it.
Yes it is enabled
But disable or mask won’t work.
After boot it is loaded. What is this?
systemctl status nvidia-persistenced.service
● nvidia-persistenced.service - NVIDIA Persistence Daemon
Loaded: loaded (/lib/systemd/system/nvidia-persistenced.service; static)
Active: active (running) since Sun 2021-12-26 22:25:22 CET; 30s ago
Process: 935 ExecStart=/usr/bin/nvidia-persistenced --user nvidia-persistenced --no-persistence-mode --verbose (code=exited, status=0/SUCCESS)
Main PID: 938 (nvidia-persiste)
Tasks: 1 (limit: 38356)
Memory: 724.0K
CPU: 3ms
CGroup: /system.slice/nvidia-persistenced.service
└─938 /usr/bin/nvidia-persistenced --user nvidia-persistenced --no-persistence-mode --verbose
Dez 26 22:25:22 michael-MacPro systemd[1]: Starting NVIDIA Persistence Daemon...
Dez 26 22:25:22 michael-MacPro nvidia-persistenced[938]: Verbose syslog connection opened
Dez 26 22:25:22 michael-MacPro nvidia-persistenced[938]: Now running with user ID 124 and group ID 134
Dez 26 22:25:22 michael-MacPro nvidia-persistenced[938]: Started (938)
Dez 26 22:25:22 michael-MacPro nvidia-persistenced[938]: device 0000:19:00.0 - registered
Dez 26 22:25:22 michael-MacPro nvidia-persistenced[938]: Local RPC services initialized
Dez 26 22:25:22 michael-MacPro systemd[1]: Started NVIDIA Persistence Daemon.
What can I do with the command
$ nvidia-persistenced
generix
December 26, 2021, 9:43pm
24
It’s needed for headless, compute-only servers to keep the driver loaded and initialized.
It should be no problem to disable it; IIRC Ubuntu uses a udev rule in /lib/udev/rules.d to start it. Try removing that rule and running sudo update-initramfs -u to also remove it from the initrd.
generix
December 26, 2021, 9:45pm
25
Or just add
systemctl stop nvidia-persistenced
at the start of your script.
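A minimal sketch of how the start of such a script might look. The original script was not posted in full, so this is an assumption throughout: the PCI address 0000:19:00.0 is taken from the nvidia-persistenced log above, the remove/rescan step is assumed to go through the kernel's sysfs PCI interface, and GPU_ADDR, DRY_RUN, run and reload_gpu are hypothetical names. By default it only prints the commands it would execute.

```shell
#!/bin/sh
# Sketch only: stop the persistence daemon, unload the driver, re-enumerate the GPU.
# GPU_ADDR and DRY_RUN are illustrative assumptions, not part of the original script.
GPU_ADDR="${GPU_ADDR:-0000:19:00.0}"
DRY_RUN="${DRY_RUN:-1}"

# In dry-run mode print the command, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reload_gpu() {
    # Stop the daemon first so it no longer holds the driver open.
    run systemctl stop nvidia-persistenced
    run modprobe -r nvidia_uvm nvidia_drm nvidia_modeset nvidia
    run sleep 1
    # Remove the device from the PCI bus, then rescan to re-detect it.
    run sh -c "echo 1 > /sys/bus/pci/devices/$GPU_ADDR/remove"
    run sh -c "echo 1 > /sys/bus/pci/rescan"
}

reload_gpu
```

Running it as root with DRY_RUN=0 would execute the commands instead of printing them.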
Ok, the service is deactivated.
I removed it from rules.d.
systemctl status nvidia-persistenced.service
○ nvidia-persistenced.service - NVIDIA Persistence Daemon
Loaded: loaded (/lib/systemd/system/nvidia-persistenced.service; static)
Active: inactive (dead)
How does this help?
generix
December 26, 2021, 10:00pm
27
As I said, it keeps the driver loaded and thus locked. You should now be able to unload it.
No, the message is still the same.
sudo modprobe -r nvidia_uvm nvidia_drm nvidia_modeset nvidia
modprobe: FATAL: Module nvidia is in use.
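One way to see what is still holding the module is to look at its reference count and dependent modules. The following is a rough sketch, assuming the usual /proc/modules layout (name, size, reference count, dependents); module_users is a hypothetical helper name.

```shell
# Print "<refcount> <dependents>" for a module, reading /proc/modules-style text on stdin.
module_users() {
    awk -v mod="$1" '$1 == mod { print $3, $4 }'
}

# On a live system, show the current state of the nvidia module (prints nothing if not loaded).
if [ -r /proc/modules ]; then
    module_users nvidia < /proc/modules
fi

# Processes that have the device nodes open can be listed with, for example:
#   sudo fuser -v /dev/nvidia*
```

A non-zero count with "-" as the dependents list would suggest that a process, not another module, is keeping the driver loaded.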
generix
December 26, 2021, 11:13pm
29
Since you disabled nvidia-persistenced and rebooted, maybe your script is working now and Xorg is blocking module unloading, which is fine?
Please check nvidia-smi for the Xorg process.
No, I checked nvidia-smi, but it shows no process.
Now I have switched back to Manjaro Linux.
There I also need my script to remove and rescan the PCI bus. I don’t know why, but the service won’t start at boot.
So I need to start it manually or with crontab.
But… I can use the GPU with prime-run. It is not as good as using the GPU directly, but it is better than nothing, and it is much faster than the AMD GPU. For the moment I will use this.
Perhaps if I get the service running at boot, the timing under Manjaro will be better for reloading the nvidia driver?
In manjaro I have the third card.
Is it possible to switch manually to the nvidia video output?
$ ls -la /dev/dri
insgesamt 0
drwxr-xr-x 3 root root 180 27. Dez 11:22 .
drwxr-xr-x 22 root root 4280 27. Dez 12:16 ..
drwxr-xr-x 2 root root 160 27. Dez 11:22 by-path
crw-rw----+ 1 root video 226, 0 27. Dez 11:22 card0
crw-rw----+ 1 root video 226, 1 27. Dez 11:22 card1
crw-rw----+ 1 root video 226, 2 27. Dez 12:59 card2
crw-rw-rw- 1 root render 226, 128 27. Dez 11:22 renderD128
crw-rw-rw- 1 root render 226, 129 27. Dez 11:22 renderD129
crw-rw-rw- 1 root render 226, 130 27. Dez 11:22 renderD130
$ ls -la /dev/dri/by-path
insgesamt 0
drwxr-xr-x 2 root root 160 27. Dez 11:22 .
drwxr-xr-x 3 root root 180 27. Dez 11:22 ..
lrwxrwxrwx 1 root root 8 27. Dez 11:22 pci-0000:02:00.0-card -> ../card1
lrwxrwxrwx 1 root root 13 27. Dez 11:22 pci-0000:02:00.0-render -> ../renderD129
lrwxrwxrwx 1 root root 8 27. Dez 12:59 pci-0000:06:00.0-card -> ../card2
lrwxrwxrwx 1 root root 13 27. Dez 11:22 pci-0000:06:00.0-render -> ../renderD130
lrwxrwxrwx 1 root root 8 27. Dez 11:22 pci-0000:19:00.0-card -> ../card0
lrwxrwxrwx 1 root root 13 27. Dez 11:22 pci-0000:19:00.0-render -> ../renderD128
How does this look?
The script now works at boot, and /dev/dri/card0 to card2 are available.
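To double-check which cardN belongs to which GPU, the PCI vendor ID in sysfs can be read for each card. This is only a sketch: vendor_name is a hypothetical helper, and the lookup covers just the three common vendor IDs.

```shell
# Map a PCI vendor ID to a readable name (only a few well-known IDs are covered).
vendor_name() {
    case "$1" in
        0x10de) echo "NVIDIA" ;;
        0x1002) echo "AMD" ;;
        0x8086) echo "Intel" ;;
        *) echo "unknown ($1)" ;;
    esac
}

# Walk the DRM card entries; connector entries such as card0-DP-1 have no vendor file and are skipped.
for dev in /sys/class/drm/card[0-9]*; do
    [ -r "$dev/device/vendor" ] || continue
    echo "${dev##*/}: $(vendor_name "$(cat "$dev/device/vendor")")"
done
```

Together with the by-path listing above, this shows whether the card at pci-0000:19:00.0 is the NVIDIA one.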
journal.txt (129.1 KB)
nvidia-bug-report.log (1.0 MB)
I have display output from the eGPU now.
Providers: number : 3
Provider 0: id: 0x1b7 cap: 0x1, Source Output crtcs: 4 outputs: 4 associated providers: 2 name:NVIDIA-0
Provider 1: id: 0x243 cap: 0xf, Source Output, Sink Output, Source Offload, Sink Offload crtcs: 6 outputs: 6 associated providers: 1 name:modesetting
Provider 2: id: 0x208 cap: 0xf, Source Output, Sink Output, Source Offload, Sink Offload crtcs: 6 outputs: 6 associated providers: 1 name:modesetting
$ nvidia-smi
Mon Dec 27 18:46:30 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.86       Driver Version: 470.86       CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:19:00.0  On |                  N/A |
|  0%   55C    P0    N/A /  90W |    102MiB /  4040MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1505      G   /usr/lib/Xorg                      99MiB |
|    0   N/A  N/A      2139      G   /usr/bin/nvidia-settings            0MiB |
+-----------------------------------------------------------------------------+
I used this xorg.conf
Section "Module"
    Load "modesetting"
EndSection
Section "Device"
    Identifier "Device0"
    Driver "nvidia"
    BusID "PCI:25:0:0"
    Option "AllowEmptyInitialConfiguration"
    Option "AllowExternalGpus" "True"
EndSection
nvidia-settings also shows me the card infos and Displays.
But it is very slow. I think it doesn’t use the GPU for 3D, only for display. What can I do?
nvidia-bug-report.log (1.0 MB)
generix
December 27, 2021, 9:01pm
34
Your suspicion is correct, the glx driver is not found:
Failed to load module "glxserver_nvidia"
You need to set the path to it, e.g. in a Section "Files" with
ModulePath "/usr/lib/nvidia/xorg"
ModulePath "/usr/lib/xorg/modules"
Yes that works. Great.
One small problem left.
OpenGL now uses the NVIDIA card, but Vulkan uses it only with the environment variable __NV_PRIME_RENDER_OFFLOAD=1
glxinfo | grep vendor
server glx vendor string: NVIDIA Corporation
client glx vendor string: NVIDIA Corporation
OpenGL vendor string: NVIDIA Corporation
vkcube
WARNING: radv is not a conformant Vulkan implementation, testing use only.
WARNING: radv is not a conformant Vulkan implementation, testing use only.
Selected GPU 0: AMD RADV TAHITI, type: 2
__NV_PRIME_RENDER_OFFLOAD=1 vkcube
WARNING: radv is not a conformant Vulkan implementation, testing use only.
WARNING: radv is not a conformant Vulkan implementation, testing use only.
Selected GPU 0: NVIDIA GeForce GTX 1050 Ti, type: 2
How can I set Vulkan to automatically use the NVIDIA card?
And is it possible to optimize anything? Any tips?
generix
December 28, 2021, 12:25pm
36
This seems to be a shortcoming of Vulkan. From the render offload page:
Vulkan applications use the Vulkan API to enumerate the GPUs in the system and select which GPU to use; most Vulkan applications will use the first GPU reported by Vulkan.
You could use
export __NV_PRIME_RENDER_OFFLOAD=1
in your system/user profile.
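As a sketch, the lines could go into a file such as ~/.profile; which file is actually read at login depends on the distribution and shell, so the location is an assumption. The commented-out __VK_LAYER_NV_optimus variable is described in NVIDIA's PRIME render offload documentation and may or may not be needed on this setup.

```shell
# Candidate lines for ~/.profile (file location is an assumption; adjust for your shell/distro).
# Ask the NVIDIA driver to handle rendering for every application started from this session.
export __NV_PRIME_RENDER_OFFLOAD=1

# Optional, per NVIDIA's render offload documentation: report only NVIDIA GPUs to Vulkan applications.
# export __VK_LAYER_NV_optimus=NVIDIA_only
```

After logging in again, running vkcube without any prefix should show which GPU gets selected.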
Yes, I will try that. Otherwise, it now works as it should.
Many thanks.