Qubes OS - Applications do not open

Hello, I have an MSI GF66 laptop with an RTX 3050 GPU, running Qubes OS. I am trying to play games on it. I successfully passed the GPU through to the virtual machine and installed the driver. Now, when I try to open applications, they do not open.
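
For context, the passthrough itself was done from dom0 with qvm-pci, roughly like this (a sketch only; the VM name and PCI address below are placeholders for my actual values):

qvm-pci                                              # dom0: list assignable PCI devices
qvm-pci attach --persistent gaming-vm dom0:01_00.0   # placeholder VM name / BDF of the RTX 3050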

My log is attached:
nvidia-bug-report.log (142.8 KB)

OK, I blacklisted the nouveau driver, and that got me past an error message in the initial logs. Here are my new logs, with the same issue of applications not opening:

nvidia-bug-report.log (1.3 MB)
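
For reference, the blacklisting itself was just a standard modprobe.d entry plus an initramfs rebuild (a sketch; the file name is arbitrary and the rebuild command depends on the template distro, e.g. dracut --force on Fedora or update-initramfs -u on Debian):

# /etc/modprobe.d/blacklist-nouveau.conf
blacklist nouveau
options nouveau modeset=0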

When I try to run nvidia-settings I get “Unable to load info from any system”

I don’t think that will work that way. Since you’re running in a virtual environment, the primary graphics device is a virtual one which doesn’t support PRIME to get the output from the (secondary) nvidia gpu. You could try using VirtualGL, but that only works for OpenGL.
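
If you go the VirtualGL route, the usual pattern is to launch the application through vglrun so that its OpenGL rendering gets redirected to the nvidia gpu (a sketch, assuming VirtualGL is installed inside the VM):

vglrun glxinfo | grep "OpenGL renderer"   # should name the nvidia gpu if redirection works
vglrun glxgears                           # simple OpenGL smoke test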

OK, I found out Xorg wasn’t using the right config files. Once I fixed that, it properly loaded the NVIDIA drivers. Is there a way to disable PRIME completely?
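
In case it helps anyone else hitting this, the quickest check I found is to grep the Xorg log for which config files the server actually loaded (standard log path shown; under Qubes it may be redirected elsewhere):

grep -iE "using (config file|config directory|system config)" /var/log/Xorg.0.log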

I am encountering another error: xf86OpenConsole: VT_ACTIVATE failed: Operation not permitted.

Maybe you can help?

Thanks

-Ryan

Since your internal display has a fixed connection to the igpu only, you can only hook up an external monitor to the nvidia gpu.

So it’s not possible for me to have a VM running on the same screen with this laptop?

You could try to use Bumblebee; maybe that will work with your virtual graphics.

Thank You

@generix Do you know how to fix

xf86OpenConsole: VT_ACTIVATE failed: Operation not permitted

Does it have to do with the standard user not being able to access the driver?

Rather, it has to do with the standard user not being logged in from a console. Are you trying to start an Xserver from a terminal window?

No, I am not.

So when does the error message show up then?

In the Xorg logs, and it happens at boot.

Here are my current logs:

nvidia-bug-report.log (127.8 KB)

These are the Xserver errors I am having, which are located in the log:

____________________________________________

xset -q:

/usr/bin/nvidia-bug-report.sh: line 906: xset: command not found
____________________________________________

nvidia-settings -q all:

Unable to init server: Could not connect: Connection refused

ERROR: Unable to find display on any available system


ERROR: Unable to find display on any available system


____________________________________________

xrandr --verbose:

Can't open display :0
____________________________________________

Running window manager properties:

Unable to detect window manager properties
____________________________________________

There are no Xserver-related logs included. It doesn’t seem like any Xserver is starting at all.

LAST EDIT:
Uh, I got it working. “Operation not permitted” and running as user got me thinking: I should run Xorg as root. Qubes uses /usr/bin/qubes-run-xorg, which is a shell script that ends like this:

if qsvc guivm-gui-agent; then
    DISPLAY_XORG=:1

    # Create Xorg. Xephyr will be started using qubes-start-xephyr later.
    exec runuser -u "$DEFAULT_USER" -- /bin/sh -l -c "exec $XORG $DISPLAY_XORG -nolisten tcp vt07 -wr -config xorg-qubes.conf > ~/.xorg-errors 2>&1" &
else
    # Use sh -l here to load all session startup scripts (/etc/profile, ~/.profile
    # etc) to populate environment. This is the environment that will be used for
    # all user applications and qrexec calls.
    exec /usr/bin/qubes-gui-runuser "$DEFAULT_USER" /bin/sh -l -c "exec /usr/bin/xinit $XSESSION -- $XORG :0 -nolisten tcp vt07 -wr -config xorg-qubes.conf > ~/.xsession-errors 2>&1"
fi

Adding DEFAULT_USER="root" above this if statement launches Xorg as root, and everything just works with the final config at the bottom.
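
Concretely, the change is a single line added to /usr/bin/qubes-run-xorg just above the if statement quoted above (a minimal sketch; the rest of the script stays untouched):

# override the user that Xorg gets launched as
DEFAULT_USER="root"

Running Xorg as root is a workaround rather than a proper fix, but it gets things going.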

ORIGINAL POST:
Hello, I thought I should chime in with more information about the problem, as I’ve just run into this as well. It is also very nice to know that NVIDIA is now allowing NVIDIA GPUs to run inside Linux VMs! Last time I tried, in September I believe, I didn’t get nearly this far.

I will start from the beginning. I am attempting to get CUDA applications running in a VM in a “headless” manner. When a Qubes VM starts without an NVIDIA GPU attached, this is what Xorg.0.log looks like: Standard Qubes Xorg.0.log (27.8 KB), and we can see Xorg is working as expected:

root         526  0.0  0.1  11504  6236 tty7     S+   10:38   0:00 /usr/bin/qubes-gui-runuser user /bin/sh -l -c exec /usr/bin/xinit /etc/X11/xinit/xinitrc -- /usr/libexec/Xorg :0 -nolisten tcp vt07 -wr -config xorg-qubes.conf > ~/.xsession-errors 2>&1
user         552  0.0  0.0   4148  1292 ?        Ss   10:38   0:00 /usr/bin/xinit /etc/X11/xinit/xinitrc -- /usr/libexec/Xorg :0 -nolisten tcp vt07 -wr -config xorg-qubes.conf
user         627  1.7  2.7 286832 110728 ?       Sl   10:38   0:36 /usr/libexec/Xorg :0 -nolisten tcp vt07 -wr -config xorg-qubes.conf

Of note is that Xorg is started as user. The associated xorg-qubes.conf looks like this:

Section "Module"
        Load "fb"
EndSection

Section "ServerLayout"   
        Identifier     "Default Layout"
        Screen      0  "Screen0" 0 0  
        InputDevice "qubesdev"
EndSection

Section "Device"
        Identifier  "Videocard0"
        Driver      "dummyqbs"
        VideoRam 22501
        Option "GUIDomID" "0"
EndSection

Section "Monitor"
        Identifier "Monitor0"
        HorizSync 49-50
	VertRefresh 34-35
	Modeline "QB2560x1440" 128 2560 2561 2562 2563 1440 1441 1442 1443 
EndSection

Section "Screen"
        Identifier "Screen0"
        Device     "Videocard0"
	Monitor    "Monitor0"
        DefaultDepth     24
        SubSection "Display"
                Viewport   0 0
                Depth     24 
		Modes "QB2560x1440" 
        EndSubSection
EndSection


Section "InputDevice"
        Identifier  "qubesdev"
        Driver      "qubes"
EndSection

This file is generated from a template by the qubes-gui-agent system service at service start-up.

With this setup, as long as the NVIDIA device is not referenced in any of the xorg.conf files, the GPU is in a strange state:

bash-5.1# nvidia-smi
Wed Feb 16 12:33:42 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:00:08.0 Off |                  N/A |
| 30%   37C    P0    N/A / 220W |      0MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

But torch reports CUDA as being available:

(base) [user@gpu-linux ~]$ python
Python 3.9.7 (default, Sep 16 2021, 13:09:58) 
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
True
>>> 

And the Xorg.0.log:
Xorg.0.log (33.0 KB)

This is different from my previous experience; IIRC, Xorg previously needed to know about the GPU in order for CUDA to work. That’s nice. However, I would ideally like to have Coolbits enabled, and as far as I know, Coolbits is dependent on Xorg for whatever reason. So maybe it’ll work if I just tell Xorg about the device myself?

(base) [user@gpu-linux ~]$ cat /etc/X11/xorg.conf.d/nvidia.conf 
Section "Device"
# discrete GPU NVIDIA
   Identifier      "nvidia"
   Driver          "nvidia"
   VendorName      "NVIDIA Corporation"
   BoardName       "GeForce RTX 3070"
   Option          "Coolbits" "28"
   BusID           "PCI:8:0:0"
EndSection
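
For context on why I want this: Coolbits 28 (4 + 8 + 16) is the value that unlocks manual fan control, clock offsets, and overvoltage through nvidia-settings. Once an Xserver that knows about the nvidia device is actually running, the end goal is something along these lines (a sketch; the gpu/fan indices and display number are assumptions):

# enable manual fan control and set the fan to 60%
DISPLAY=:0 nvidia-settings -a "[gpu:0]/GPUFanControlState=1" -a "[fan:0]/GPUTargetFanSpeed=60"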

Restarting Xorg pretty much does nothing: nvidia-smi can still report information, and torch says CUDA is available. OK, maybe I need to make a screen out of it?

bash-5.1# cat /etc/X11/xorg.conf.d/nvidia.conf 
Section "Screen"
# virtual monitor
    Identifier     "Screen1"
# discrete GPU nvidia
    Device         "nvidia"
# virtual monitor
    Monitor        "Monitor1"
    DefaultDepth 24
    SubSection     "Display"
       Depth 24
    EndSubSection
EndSection

Section "Monitor"
    Identifier     "Monitor1"
    VendorName     "Unknown"
    Option         "DPMS"
EndSection


Section "Device"
# discrete GPU NVIDIA
   Identifier      "nvidia"
   Driver          "nvidia"
   VendorName      "NVIDIA Corporation"
   BoardName       "GeForce RTX 3070"
   Option          "Coolbits" "28"
   BusID           "PCI:8:0:0"
EndSection

Of course, nothing happens. Maybe if I add my own server layout?

bash-5.1# cat /etc/X11/xorg.conf.d/nvidia.conf 

Section "ServerLayout"
  Identifier	"Default Layout"
#   Option "AllowNVIDIAGPUScreens"
  Screen 0 "Screen0" 0 0 
  Screen 1 "Screen1"
  InputDevice "qubesdev"
EndSection

Section "Screen"
# virtual monitor
    Identifier     "Screen1"
# discrete GPU nvidia
    Device         "nvidia"
# virtual monitor
    Monitor        "Monitor1"
    DefaultDepth 24
    SubSection     "Display"
       Depth 24
    EndSubSection
EndSection

Section "Monitor"
    Identifier     "Monitor1"
    VendorName     "Unknown"
    Option         "DPMS"
EndSection


Section "Device"
# discrete GPU NVIDIA
   Identifier      "nvidia"
   Driver          "nvidia"
   VendorName      "NVIDIA Corporation"
   BoardName       "GeForce RTX 3070"
   Option          "Coolbits" "28"
   BusID           "PCI:8:0:0"
EndSection

Restarting Xorg, and it doesn’t work. This is where the xf86OpenConsole: VT_ACTIVATE failed: Operation not permitted error comes in. Perhaps my xorg.conf is very naive; I don’t really know. Here’s the log:
crashed Xorg.0.log (5.4 KB)
At this point I’m running in a console from another VM (qvm-console-in-dispvm gpu-linux in dom0), and nvidia-smi still seems to recognize the GPU, and PyTorch reports CUDA as available. Removing the nvidia.conf file and restarting Xorg works, of course. Here is the full dmesg log:
dmesg.log (70.5 KB)
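
For completeness, “restarting Xorg” above means restarting the GUI agent service inside the VM, which regenerates xorg-qubes.conf from its template and relaunches the server (this assumes the systemd unit is named after the service mentioned earlier):

sudo systemctl restart qubes-gui-agent
journalctl -u qubes-gui-agent -b --no-pager | tail -n 50   # check for agent/Xorg startup errors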

Another note: I’ve had to restart my computer a few times because it seems that eventually the driver/hardware/something needs it. Even after removing the nvidia.conf file, it will not work at all, and dmesg will contain RmInitAdapter errors. I have lost the codes for these, but if it happens again I will post them.

EDIT: I found one of the errors. I do not know the context for this one. I was just trying random Xorg configurations:

[ 4655.853492] NVRM: GPU 0000:00:08.0: RmInitAdapter failed! (0x23:0x65:1401)
[ 4655.853925] NVRM: GPU 0000:00:08.0: rm_init_adapter failed, device minor number 0
[ 4659.880394] NVRM: GPU 0000:00:08.0: RmInitAdapter failed! (0x23:0x65:1401)
[ 4659.880770] NVRM: GPU 0000:00:08.0: rm_init_adapter failed, device minor number 0

And here’s another one that I remember seeing; excuse the ???, as I don’t remember what these numbers were:

NVRM: GPU 0000:00:08.0: RmInitAdapter failed! ([???]:[???]:1451)
?????? X_ID ????????
NVRM: GPU 0000:00:08.0: RmInitAdapter failed! ([???]:[???]:1451)

I believe 1451 was the code, or maybe it was 1651; I’m not sure. Of course, these numbers are opaque to me.