Wondering if the RTX 5000 (Razer notebook) works on Linux?

I’ve been trying to get this working for days on Qubes OS (a Linux distro)… very painful. I wanted to see if anyone knows whether the current Linux drivers support this card on a notebook. I am using Linux x64 (AMD64/EM64T) Display Driver version 415.25 …

Any help is VERY much appreciated… thank you!

I don’t think that’ll work with Qubes OS. It’s a hypervisor-based virtual system, so you would have to pass the GPU through to the guest, but you can’t, since it’s used by the hypervisor.

Thanks for the reply. Do you know if that card will work with other distros? If so, are there any you’d recommend that work well?

I’ll ask with a bit more detail for clarity.

  • Irrespective of Qubes, do you know if the RTX 5000 works on ‘any’ Linux distro? If so, is there one that is recommended or that can be set up with the least complication?
  • The lowest level VM in Qubes is called ‘dom0’. It allows you to install drivers and interface directly with hardware.

It works in all distros. For convenience, you should choose one that provides a packaged driver without much hassle.

Thanks for your help thus far. Provided the driver for the RTX 5000 works as you indicate, it should be possible to have the driver work on Qubes. According to the official Qubes documentation here, it is possible for the OS to work with NVIDIA cards.

The main Qubes workspace is called ‘dom0’ which gives you access to all hardware and device drivers. It’s not recommended to install software at this layer for security purposes but it is totally possible for things like video cards, etc.

After fighting with this for a few days, I’m starting to make some progress but not quite there yet. I will post some key files and logs and hopefully you or someone on these forums can help.

Distro: GNU/Linux (Fedora)
Kernel: 4.19.107-1.pvops.qubes.x86_64
Card: NVIDIA Quadro RTX 5000 on Razer Blade 15 (2019)
Driver: NVRM NVIDIA UNIX x86_64 Kernel Module 415.18

Nouveau:

  1. Nouveau has been blacklisted in /etc/modprobe.d
  2. ‘lsmod | grep nouveau’ returns empty

‘lsmod | grep nvidia’ returns:
nvidia_drm 53248 0
nvidia_modeset 1040384 1 nvidia_drm
nvidia 17281024 1 nvidia_modeset
ipmi_msghandler 61440 2 ipmi_devintf,nvidia
drm_kms_helper 200704 2 nvidia_drm,i915
drm 487424 6 drm_kms_helper,nvidia_drm,i915

‘lspci -v | grep VGA -A 12’ returns:

01:00.0 VGA compatible controller: NVIDIA Corporation Device 1eb5 (rev a1) (prog-if 00 [VGA controller])
Subsystem: Razer USA Ltd. Device 2008
Flags: bus master, fast devsel, latency 0, IRQ 16
Memory at 57000000 (32-bit, non-prefetchable) [size=16M]
Memory at 6030000000 (64-bit, prefetchable) [size=256M]
Memory at 6040000000 (64-bit, prefetchable) [size=32M]
I/O ports at 3000 [size=128]
Expansion ROM at 58000000 [disabled] [size=512K]
Capabilities:
Kernel driver in use: nvidia
Kernel modules: nouveau, nvidia_drm, nvidia
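
To double-check which kernel driver has actually claimed the card, the "Kernel driver in use" line can be pulled out of the lspci output with a small script. This is only a minimal sketch in Python: the function name driver_in_use is made up for illustration, and it assumes slot addresses without a PCI domain prefix, as in the output above.

```python
import re

def driver_in_use(lspci_text, slot):
    """Return the 'Kernel driver in use' value for a slot such as '01:00.0', or None."""
    current = None
    for line in lspci_text.splitlines():
        # A device block starts with its slot address (hex) at the start of the line.
        m = re.match(r"^([0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])\s", line)
        if m:
            current = m.group(1)
            continue
        if current == slot:
            m = re.match(r"^\s*Kernel driver in use:\s*(\S+)", line)
            if m:
                return m.group(1)
    return None

# Sample taken from the lspci output quoted in this thread (shortened).
SAMPLE = """01:00.0 VGA compatible controller: NVIDIA Corporation Device 1eb5 (rev a1) (prog-if 00 [VGA controller])
\tSubsystem: Razer USA Ltd. Device 2008
\tKernel driver in use: nvidia
\tKernel modules: nouveau, nvidia_drm, nvidia
"""

print(driver_in_use(SAMPLE, "01:00.0"))  # prints: nvidia
```

Here the result is nvidia rather than nouveau, which matches the blacklist check above.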

dmesg attached as dmesg.log (102.9 KB)

Basically, it fails when I start X Windows with init 5 from the command line. The error messages are at the end of the attached dmesg log.

I also get “No running processes found” when I run ‘nvidia-smi’ from the command line, and I can’t open nvidia-settings even though it is installed.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 415.18       Driver Version: 415.18       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Graphics Device     Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   60C    P0     3W /  N/A |      0MiB / 16095MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

It’s an Optimus (hybrid graphics) notebook so you’ll have to set up PRIME output:
https://devtalk.nvidia.com/default/topic/1022670/linux/official-driver-384-59-with-geforce-1050m-doesn-t-work-on-opensuse-tumbleweed-kde/post/5203910/#5203910
Furthermore, you should use a driver newer than v415.
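
One detail that often trips people up with PRIME configs: if the xorg.conf identifies the NVIDIA GPU with a BusID entry (an assumption here, since the config itself is not quoted in this thread), note that lspci prints the slot in hexadecimal while Xorg expects decimal in the form PCI:bus:device:function. A minimal Python sketch of the conversion, with a made-up helper name:

```python
def lspci_to_xorg_busid(slot):
    """Convert an lspci slot like '01:00.0' (hex) to Xorg's 'PCI:1:0:0' (decimal)."""
    bus_dev, func = slot.split(".")
    parts = bus_dev.split(":")
    # Take the last two fields so an optional leading PCI domain ('0000:') is ignored.
    bus, dev = parts[-2], parts[-1]
    return "PCI:{}:{}:{}".format(int(bus, 16), int(dev, 16), int(func, 16))

print(lspci_to_xorg_busid("01:00.0"))  # prints: PCI:1:0:0
print(lspci_to_xorg_busid("0a:00.0"))  # prints: PCI:10:0:0
```

For the card in this thread (01:00.0) the hex and decimal values happen to coincide, so this only matters on machines with bus numbers of 0a and above.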

Okay, I uninstalled the old driver using the --uninstall option of the previous driver’s .run installer.

Updated driver to:

NVRM version: NVIDIA UNIX x86_64 Kernel Module  440.82
GCC version:  gcc version 6.4.1 20170727 (Red Hat 6.4.1-1) (GCC) 

Updated to PRIME (see updated xorg.conf):

  • xrandr v = 1.5.0
  • X version = 1.19.3

Getting the following messages in dmesg when grepping for nvidia:

[  327.912289] nvidia: loading out-of-tree module taints kernel.
[  327.912295] nvidia: module license 'NVIDIA' taints kernel.
[  327.955691] nvidia-nvlink: Nvlink Core is being initialized, major device number 235
[  327.959794] nvidia 0000:01:00.0: enabling device (0000 -> 0003)
[  327.959937] nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[  328.052450] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  440.82  Wed Apr  1 19:41:29 UTC 2020
[  328.057596] [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
[  328.057598] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 1

Getting the following messages from /var/log/Xorg.0.log when grepping for nvidia:

[   222.377] (**) |-->Screen "nvidia" (0)
[   222.377] (**) |   |-->Device "nvidia"
[   222.377] (**) |   |-->GPUDevice "nvidia"
[   222.417] (II) LoadModule: "nvidia"
[   222.417] (II) Loading /usr/lib64/xorg/modules/drivers/nvidia_drv.so
[   222.418] (II) Module nvidia: vendor="NVIDIA Corporation"
[   222.461] (II) Loading sub module "glxserver_nvidia"
[   222.461] (II) LoadModule: "glxserver_nvidia"
[   222.461] (II) Loading /usr/lib64/xorg/modules/extensions/libglxserver_nvidia.so
[   222.487] (II) Module glxserver_nvidia: vendor="NVIDIA Corporation"
[   223.249] (==) NVIDIA(0): No modes were requested; the default mode "nvidia-auto-select"
[   223.251] (II) NVIDIA(0):     "DFP-4:nvidia-auto-select"

Xorg startup seems to fail here, according to the logs:

[   223.361] (II) NVIDIA: Using 24576.00 MB of virtual memory for indirect memory
[   223.361] (II) NVIDIA:     access.
[   223.375] (II) NVIDIA(0): ACPI: failed to connect to the ACPI event daemon; the daemon
[   223.375] (II) NVIDIA(0):     may not be running or the "AcpidSocketPath" X
[   223.375] (II) NVIDIA(0):     configuration option may not be set correctly.  When the
[   223.375] (II) NVIDIA(0):     ACPI event daemon is available, the NVIDIA X driver will
[   223.375] (II) NVIDIA(0):     try to use it to receive ACPI event notifications.  For
[   223.375] (II) NVIDIA(0):     details, please see the "ConnectToAcpid" and
[   223.375] (II) NVIDIA(0):     "AcpidSocketPath" X configuration options in Appendix B: X
[   223.375] (II) NVIDIA(0):     Config Options in the README.
[   223.379] (EE) NVIDIA(0): Failed to allocate software rendering cache surface: out of
[   223.379] (EE) NVIDIA(0):     memory.
[   223.379] (EE) NVIDIA(0):  *** Aborting ***
[   223.381] (EE) 
Fatal server error:
[   223.381] (EE) NVIDIA: A GPU exception occurred during X server initialization
[   223.381] (EE) 
[   223.381] (EE) 
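
Since the Xorg log is long and most of it is (II) informational noise, it can help to filter it down to the (EE) lines only. A minimal sketch in Python, assuming the usual "[ timestamp] (EE) message" layout seen above; the function name xorg_errors is made up for illustration.

```python
import re

def xorg_errors(log_text):
    """Return the messages of all (EE) lines, timestamps stripped, empty ones skipped."""
    errors = []
    for line in log_text.splitlines():
        m = re.match(r"^\[\s*[\d.]+\]\s+\(EE\)\s*(.*)$", line)
        if m and m.group(1).strip():
            errors.append(m.group(1).strip())
    return errors

# Sample lines copied from the Xorg log excerpt in this thread.
SAMPLE_LOG = """[   223.375] (II) NVIDIA(0):     Config Options in the README.
[   223.379] (EE) NVIDIA(0): Failed to allocate software rendering cache surface: out of
[   223.379] (EE) NVIDIA(0):     memory.
[   223.379] (EE) NVIDIA(0):  *** Aborting ***
[   223.381] (EE) 
Fatal server error:
[   223.381] (EE) NVIDIA: A GPU exception occurred during X server initialization
[   223.381] (EE) 
"""

for message in xorg_errors(SAMPLE_LOG):
    print(message)
```

Note that the ACPI daemon message is logged as (II), so it is filtered out; the first real error is the failed cache surface allocation.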

Attached:

  • xorg.conf
  • dmesg log
  • Xorg.0.log

dmesg.log (101.2 KB)

xorg.log (17.9 KB)

xorg_conf.log (670 Bytes)

Config looks correct, but there seems to be an additional config file interfering. Please run nvidia-bug-report.sh as root and attach the resulting nvidia-bug-report.log.gz file to your post. You will have to rename the file extension to something else since the forum software doesn’t accept .gz files (nifty!).

File here… thanks for your help! Much appreciated.

nvidia-bug-report.log (1.8 MB)

Any thoughts?

Something is wrong with the PAT config of the kernel and the nvidia driver needs that:
[ 19.291017] NVRM: PAT configuration unsupported.
The dmesg contains more errors about PAT being skipped, you should check why.

I’ve been trying to figure out why. The kernel has PAT enabled, so that is not the issue. Could it be something to do with the Xen hypervisor on Qubes? Could it require a license from the source code? Could it have anything to do with the UEFI/EFI boot system?

It’s the hypervisor. Even dom0 is not like a bare-metal install:
x86/PAT: MTRRs disabled, skipping PAT initialization too.
I really don’t know how to get around that.
Looking at https://lwn.net/Articles/323368/ makes me think it should work.
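
For anyone wanting to check their own log for the same symptom, the PAT and MTRR messages can be filtered out of a saved dmesg dump. A minimal Python sketch; the function name pat_mtrr_lines is made up, and the match is deliberately case-sensitive so that ordinary words like "taints" or "pattern" are not picked up.

```python
import re

def pat_mtrr_lines(dmesg_text):
    """Return the dmesg lines that mention PAT or MTRR(s) as whole, upper-case words."""
    matches = []
    for line in dmesg_text.splitlines():
        if re.search(r"\b(PAT|MTRRs?)\b", line):
            matches.append(line.strip())
    return matches

# Sample lines copied from the messages quoted in this thread.
SAMPLE_DMESG = """[  327.912289] nvidia: loading out-of-tree module taints kernel.
x86/PAT: MTRRs disabled, skipping PAT initialization too.
[ 19.291017] NVRM: PAT configuration unsupported.
"""

for line in pat_mtrr_lines(SAMPLE_DMESG):
    print(line)
```

On this sample it prints the x86/PAT line and the NVRM line and skips the module taint message.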

Do the latest versions of drivers need MTRR enabled to init PAT?

“MTRR use is replaced on modern x86 hardware with PAT. Direct MTRR use by drivers on Linux is now completely phased out, device drivers should use arch_phys_wc_add() in combination with ioremap_wc() to make MTRR effective on non-PAT systems while a no-op but equally effective on PAT enabled systems.”

https://www.kernel.org/doc/html/latest/x86/mtrr.html

No, it’s the (newer) Linux kernel. If MTRR is not available, PAT is also disabled.
Edit: the driver just needs PAT; it doesn’t care about MTRR.