We have been able to hide the bhyve / KVM signature and it seems to have worked, although I’m not totally sure, since I still can’t turn on my external monitor. I suspect, though, that this is unrelated to the old error.
So, on Debian Linux, I attached an old monitor to the HDMI port of the graphics card and tried to run “startx” from the Linux terminal. This is what happened:
root@marietto-BHYVE:~# nvidia-xconfig --query-gpu-info
Number of GPUs: 1
GPU #0:
Name : NVIDIA GeForce RTX 2080 Ti
UUID : GPU-74c6f81e-9e6c-f279-2878-2d28f4333f6b
PCI BusID : PCI:0:3:0
Number of Display Devices: 1
Display Device 0 (TV-2):
EDID Name : Samsung SyncMaster
Minimum HorizSync : 30.000 kHz
Maximum HorizSync : 81.000 kHz
Minimum VertRefresh : 56 Hz
Maximum VertRefresh : 75 Hz
Maximum PixelClock : 140.000 MHz
Maximum Width : 1280 pixels
Maximum Height : 1024 pixels
Preferred Width : 1280 pixels
Preferred Height : 1024 pixels
Preferred VertRefresh : 60 Hz
Physical Width : 340 mm
Physical Height : 270 mm
root@marietto-BHYVE:~# nvidia-xconfig
WARNING: Unable to locate/open X configuration file.
New X configuration file written to '/etc/X11/xorg.conf'
root@marietto-BHYVE:~# nano /etc/X11/xorg.conf
# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig: version 470.57.02
Section "ServerLayout"
Identifier "Layout0"
Screen 0 "Screen0"
InputDevice "Keyboard0" "CoreKeyboard"
InputDevice "Mouse0" "CorePointer"
EndSection
Section "Files"
EndSection
Section "InputDevice"
# generated from default
Identifier "Mouse0"
Driver "mouse"
Option "Protocol" "auto"
Option "Device" "/dev/psaux"
Option "Emulate3Buttons" "no"
Option "ZAxisMapping" "4 5"
EndSection
Section "InputDevice"
# generated from default
Identifier "Keyboard0"
Driver "kbd"
EndSection
Section "Monitor"
Identifier "Monitor0"
VendorName "Unknown"
ModelName "Unknown"
Option "DPMS"
EndSection
Section "Device"
Identifier "Device0"
Driver "nvidia"
BusID "PCI:0:3:0"
VendorName "NVIDIA Corporation"
EndSection
Section "Screen"
Identifier "Screen0"
Device "Device0"
Monitor "Monitor0"
DefaultDepth 24
SubSection "Display"
Depth 24
EndSubSection
EndSection
root@marietto-BHYVE:~# startx
X.Org X Server 1.20.11
X Protocol Version 11, Revision 0
Build Operating System: linux Debian
Current Operating System: Linux marietto-BHYVE 5.10.0-9-amd64 #1 SMP Debian 5.10.70-1 (2021-09-30) x86_64
Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.10.0-9-amd64 root=UUID=a33689a9-06d9-4bba-9e75-fdd1831b7e48 ro quiet splash resume=UUID=044c5b8f-f086-4491-9606-3bfa409b9d73
Build Date: 13 April 2021 04:07:31PM
xorg-server 2:1.20.11-1 (https://www.debian.org/support)
Current version of pixman: 0.40.0
Before reporting problems, check http://wiki.x.org
to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
(++) from command line, (!!) notice, (II) informational,
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Wed Nov 3 09:41:43 2021
(==) Using config file: "/etc/X11/xorg.conf"
(==) Using config directory: "/etc/X11/xorg.conf.d"
(==) Using system config directory "/usr/share/X11/xorg.conf.d"
(EE)
Fatal server error:
(EE) AddScreen/ScreenInit failed for driver 0
(EE)
(EE)
Please consult the The X.Org Foundation support
at http://wiki.x.org
for help.
(EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
(EE)
(EE) Server terminated with error (1). Closing log file.
xinit: giving up
xinit: unable to connect to X server: Connection refused
xinit: server error
I've attached the Xorg log file. My monitor does not turn on at all. I thought the cause was that my monitor is very old, but I tried another monitor and saw the same error.
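For anyone following along without the attachment, the relevant lines can be pulled out of the log with a small filter. This is only a sketch: the function name `xorg_problems` is made up for illustration, and it relies on nothing more than the (EE) and (WW) markers that the X server explains in its own legend above.

```shell
#!/bin/sh
# Hypothetical helper: print only the lines of an Xorg log that carry the
# (EE) error or (WW) warning marker. Reads the files given as arguments,
# or standard input when called without arguments.
xorg_problems() {
    grep -F -e '(EE)' -e '(WW)' "$@"
}
```

Called as `xorg_problems /var/log/Xorg.0.log`, it should also print the marker legend line from the top of the log, since that line contains both markers; everything after it is the actual errors and warnings.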
NVIDIA(0): Failed to allocate shared surface
The nvidia driver depends on some CPU features; maybe those are not announced by bhyve. Please post the output of:
cat /proc/cpuinfo
Would you like to compare the previous Xorg log file with one from a working setup? I got this log file under Linux + qemu + kvm, where I ran startx just as I did on FreeBSD + bhyve, and there it worked. Log file attached.
Usually, that error shows up when PAT support is either not advertised by the CPU or is disabled. That doesn’t seem to be the case here. Please attach a full dmesg output from boot-up.
Can you be more specific about what is needed to make the nvidia driver work? Does it require MTRR to be enabled? Does disabling MTRR and treating all memory as uncacheable hurt?
Yes, the xorg graphics driver needs MTRR (and PAT, which depends on MTRR) to run.
It might be interesting to see whether wayland and cuda would work without it, but that’s more of an academic question.
I guess your goal is a fully working system. So while it might be interesting whether wayland works on a half-working system, it doesn’t help in reaching that goal.
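A quick way to check from inside the guest whether those two features are advertised at all is to look at the CPU flags. The sketch below assumes a Linux guest whose /proc/cpuinfo has a "flags" line containing `mtrr` and `pat` when they are exposed; the helper name `cpu_has_flag` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if CPU flag $1 is listed on the first
# "flags" line of a cpuinfo-style file ($2, default /proc/cpuinfo).
cpu_has_flag() {
    grep -m 1 '^flags' "${2:-/proc/cpuinfo}" 2>/dev/null | grep -q -w -e "$1"
}

# Report the two features discussed above.
for flag in mtrr pat; do
    if cpu_has_flag "$flag"; then
        echo "$flag: advertised to the guest"
    else
        echo "$flag: not advertised"
    fi
done
```

Note that this only tells you what the hypervisor announces. A flag being present does not prove that the feature behaves the way the driver expects.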
You say that xorg needs MTRR / PAT. You don’t say that the nvidia driver needs them. BUT I’ve been able to use the nouveau driver with the 2080 Ti passed through, and xorg worked correctly. So the obvious question is: why doesn’t xorg need MTRR / PAT when it finds nouveau, yet needs them when it finds the nvidia driver? Xorg is still the same and bhyve has been the same (regarding the lack of the MTRR / PAT features). What changes is the driver used.
I said the xorg driver (DDX) needs it, and we’re talking about the nvidia driver here. I only mentioned xorg to make the contrast with the (nvidia) kernel driver noticeable.
Read: the nvidia DDX needs it.
Maybe you now understand the “academic question” a bit better, which would be: “does only the (nvidia) DDX depend on MTRR, or also the (nvidia) kernel DRM/KMS driver?”
MTRRs are a no-op in bhyve - memory is still cacheable since that is now controlled by the secondary EPT tables. Maybe this is not good enough for the driver.
Hi @generix. We have enabled MTRR and PAT, but it still does not work. We are investigating the reasons. In the meantime, I want to give you some details; maybe you can work out what could be wrong at the moment:
root@marietto-BHYVE:~# cpuid -l 0x40000000
CPU 0:
hypervisor_id = "� d "
CPU 1:
hypervisor_id = "� d "
CPU 2:
hypervisor_id = "� d "
CPU 3:
hypervisor_id = "� d "
CPU 4:
hypervisor_id = "� d "
CPU 5:
hypervisor_id = "� d "
CPU 6:
hypervisor_id = "� d "
CPU 7:
hypervisor_id = "� d "
As you can see, there’s no bhyve signature. Additionally, as you can see below, MTRR and PAT are enabled, but it also reports an unexpected value: the guest hypervisor is off and we don’t know why. By contrast, with the previous change that we made, with MTRR and PAT off, the hypervisor was active. Could the guest hypervisor state being set to off be caused by the fact that we have enabled them?
Previously, when the kernel module worked, you were camouflaging as KVM; now you pretend to be bare metal, but maybe not well enough. Does it work with MTRR + KVM camouflage?
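To see at a glance which identity the guest currently presents, the `cpuid -l 0x40000000` output shown earlier can be matched against a few well-known vendor strings. This is a rough sketch, not a complete detector: the function name `hypervisor_from_cpuid` is made up, the list of signatures is deliberately short, and it assumes the tool prints a `hypervisor_id` line as in the transcript above.

```shell
#!/bin/sh
# Hypothetical helper: read `cpuid -l 0x40000000` output on standard input
# and name the first well-known hypervisor signature found in it.
# Only a few vendor strings are listed; anything else prints "unknown".
hypervisor_from_cpuid() {
    # -a keeps grep from treating non-printable signature bytes as binary.
    ids=$(grep -a 'hypervisor_id' || true)
    case "$ids" in
        *KVMKVMKVM*)      echo kvm ;;
        *'bhyve bhyve'*)  echo bhyve ;;
        *'Microsoft Hv'*) echo hyper-v ;;
        *VMwareVMware*)   echo vmware ;;
        *)                echo unknown ;;
    esac
}
```

It would be used as `cpuid -l 0x40000000 | hypervisor_from_cpuid`. With the signature hidden as in the output above, it should print `unknown`; with the KVM camouflage it should print `kvm`.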
We got it. My monitor turned on and Blender recognizes CUDA and the GPU. A heartfelt thanks to you too, because your advice has been precise and professional. Some pictures below:
Anyway, there is still something that does not work. The nvidia audio device that I pass through with bhyve does not work inside Debian. I’m talking about this device: