GL in user namespace

GL graphics display a blank screen when run in a user namespace:

/bin/glxgears

Finishing post:

#/bin/glxgears // works as expected
#unshare -U /bin/glxgears // blank window

RH 7.6
Kernel 5.3.6
DMI: HP ProLiant DL580 Gen9/ProLiant DL580 Gen9, BIOS U17 01/22/2018

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.59       Driver Version: 440.59       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro P2000        Off  | 00000000:04:00.0 Off |                  N/A |
| 46%   33C    P8     4W /  75W |     33MiB /  5059MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     25483      G   /usr/bin/Xorg                                 30MiB |
+-----------------------------------------------------------------------------+
Any insight greatly appreciated
Thanks
Scott

More info:
No errors reported.
Reported frame rate is the same.
It seems GL thinks it is rendering the image, but screen is blank.
The machine has 72 cores (Intel Xeon E7-8867) and 512 GB of RAM.
This works correctly on an Ubuntu machine with an NVS 315 card and 390-series drivers.

Thanks
Scott

Maybe these are part of the problem:

[Tue Apr 7 17:07:25 2020] NVRM: Xid (PCI:0000:04:00): 13, pid=44855, Graphics Exception on GPC 0: 3D-C MEMLAYOUT Violation. Coordinates: (0x8, 0xc)
[Tue Apr 7 17:07:25 2020] NVRM: Xid (PCI:0000:04:00): 13, pid=44855, Graphics Exception: ESR 0x500420=0x80000400 0x500434=0xc0008 0x500438=0x2a00 0x50043c=0x10000
[Tue Apr 7 17:07:25 2020] NVRM: Xid (PCI:0000:04:00): 13, pid=44855, Graphics Exception on GPC 1: 3D-C MEMLAYOUT Violation. Coordinates: (0x3, 0xc)
[Tue Apr 7 17:07:25 2020] NVRM: Xid (PCI:0000:04:00): 13, pid=44855, Graphics Exception: ESR 0x508420=0x80000400 0x508434=0xc0003 0x508438=0x2a00 0x50843c=0x10000
[Tue Apr 7 17:07:25 2020] NVRM: Xid (PCI:0000:04:00): 13, pid=44855, Graphics Exception: ChID 0010, Class 0000c197, Offset 000023c8, Data bf800000

These errors only occur when unshared.

The same problem occurs with the 390.132 drivers. I am going to try nouveau; at least I have the source and can fix it if it doesn’t work.

Hi Scott,

I tried reproducing the issue on my test setup. Running “unshare -U /bin/glxgears” leaves the window blank, but in my case the FPS also drops drastically.

My config - Precision T7610 + Ubuntu 18.04.4 + 5.5.6 + GeForce RTX 2080 + Driver 440.82

root@oemqa-Precision-T7610:~# unshare -U glxgears &
[3] 14823
root@oemqa-Precision-T7610:~# Running synchronized to the vertical refresh. The framerate should be
approximately the same as the monitor refresh rate.
11 frames in 5.2 seconds = 2.128 FPS
10 frames in 5.2 seconds = 1.925 FPS
10 frames in 5.2 seconds = 1.926 FPS
10 frames in 5.2 seconds = 1.925 FPS
10 frames in 5.2 seconds = 1.927 FPS
10 frames in 5.2 seconds = 1.924 FPS
10 frames in 5.2 seconds = 1.926 FPS

However, in your case the FPS is the same in both cases, correct?
Also, can you please confirm whether you hit Xid 13 errors every time you run “unshare -U /bin/glxgears”? I have not observed them on my setup so far, although the screen is blank.

I tried again with the configuration below, and it looks like I am able to replicate the issue.

HP Z8 G4 Workstation + Ubuntu 18.04.4 + 5.3.0-45-generic + Driver 440.59 + Quadro P6000

Repro Steps Attempted -

  1. Started a bare X server.

  2. Ran glxgears, which executed successfully; the application rendered on screen as expected.
    root@nghodake-HP-Z840-Workstation:~# Running synchronized to the vertical refresh. The framerate should be
    approximately the same as the monitor refresh rate.
    24/04/2020 12:51:20 copy_tiles: allocating first_line at size 121
    24/04/2020 12:51:20 client 1 network rate 124.0 KB/sec (1996.4 eff KB/sec)
    24/04/2020 12:51:20 client 1 latency: 39.8 ms
    24/04/2020 12:51:20 dt1: 0.0128, dt2: 0.0700 dt3: 0.0398 bytes: 10259
    24/04/2020 12:51:20 link_rate: LR_BROADBAND - 39 ms, 123 KB/s
    296 frames in 5.0 seconds = 59.111 FPS
    300 frames in 5.0 seconds = 59.990 FPS
    300 frames in 5.0 seconds = 59.997 FPS
    300 frames in 5.0 seconds = 59.998 FPS
    300 frames in 5.0 seconds = 59.997 FPS
    300 frames in 5.0 seconds = 59.996 FPS

  3. Then ran “unshare -U /bin/glxgears”, which also executed successfully, but the screen is blank, as you reported.
    root@nghodake-HP-Z840-Workstation:~# unshare -U glxgears
    Running synchronized to the vertical refresh. The framerate should be
    approximately the same as the monitor refresh rate.
    2 frames in 5.1 seconds = 0.391 FPS
    300 frames in 5.0 seconds = 59.967 FPS
    300 frames in 5.0 seconds = 59.996 FPS
    300 frames in 5.0 seconds = 59.998 FPS
    300 frames in 5.0 seconds = 59.997 FPS
    300 frames in 5.0 seconds = 59.996 FPS
    300 frames in 5.0 seconds = 59.999 FPS
    300 frames in 5.0 seconds = 59.997 FPS
    300 frames in 5.0 seconds = 59.997 FPS
    ^C

However, I observed Xid 31 after quitting glxgears in step 3. Can you please confirm whether you always get an Xid 13 error, or whether other errors appear as well?

root@nghodake-HP-Z840-Workstation:~# dmesg |grep -i xid
[761903.782958] NVRM: Xid (PCI:0000:2d:00): 31, pid=89943, Ch 00000010, intr 10000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_PROP_0 faulted @ 0x1_04800000. Fault is of type FAULT_PDE ACCESS_TYPE_WRITE
[761955.674472] NVRM: Xid (PCI:0000:2d:00): 31, pid=90699, Ch 00000010, intr 10000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_PROP_0 faulted @ 0x1_04800000. Fault is of type FAULT_PDE ACCESS_TYPE_WRITE
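
Since the Xid codes differ between the reports in this thread (13 and 31), it may help to tally them per code rather than reading the raw log. The following is a minimal sketch, assuming the kernel log lines have the "NVRM: Xid (PCI:...): <code>," shape shown above; the function name xid_summary is made up for illustration.

```shell
# Count NVRM Xid events per Xid code from kernel log text on stdin.
# Assumes lines shaped like: "... NVRM: Xid (PCI:0000:04:00): 13, ..."
xid_summary() {
    sed -n 's/.*NVRM: Xid ([^)]*): \([0-9][0-9]*\),.*/\1/p' \
        | sort -n \
        | uniq -c \
        | awk '{print "Xid " $2 ": " $1}'
}

# Usage (reading the kernel log may require root):
#   dmesg | xid_summary
```

Running it once after a plain glxgears run and once after an unshared run should show whether the errors really only appear in the unshared case.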

Hello,

Errors do not seem to be logged every time, but there is usually some kind of error.

The following occurred with the 390.132 driver while running unshare -U glxgears:

Apr 23 15:31:05 jstsdxmpcv01 kernel: NVRM: Xid (PCI:0000:04:00): 31, Ch 00000010, engmask 00000105, intr 10000000
Apr 23 15:31:16 jstsdxmpcv01 kernel: NVRM: Xid (PCI:0000:04:00): 13, Graphics Exception on GPC 0: 3D-C MEMLAYOUT Violation. Coordinates: (0xfb, 0x5c)
Apr 23 15:31:16 jstsdxmpcv01 kernel: NVRM: Xid (PCI:0000:04:00): 13, Graphics Exception: ESR 0x500420=0x80000400 0x500434=0x5c00fb 0x500438=0x2a00 0x50043c=0x310000
Apr 23 15:31:16 jstsdxmpcv01 kernel: NVRM: Xid (PCI:0000:04:00): 13, Graphics Exception on GPC 1: 3D-C MEMLAYOUT Violation. Coordinates: (0x100, 0x5c)
Apr 23 15:31:16 jstsdxmpcv01 kernel: NVRM: Xid (PCI:0000:04:00): 13, Graphics Exception: ESR 0x508420=0x80000400 0x508434=0x5c0100 0x508438=0x2a00 0x50843c=0x310000

Also, the frame rate is sometimes very low, maybe while errors are being logged?

Thanks

Scott

Hello,

I also tested a P2000 card on Ubuntu 18.04.4 LTS + 4.15.0-96-generic + Driver 390.116 and it worked correctly, no errors logged.

I think the problem may be 5.x kernel related.

Thanks

Scott

unshare -U glxgears works as expected with the nouveau driver.

Thanks
Scott

Hi Scott,

Just to confirm: the issue reported is a blank screen when running GL in a user namespace.
We will work on it and keep you updated.
We have filed bug 200611077 internally for tracking purposes.

Hello, we have fixed the bug in our development branch. The fix will appear in a future release, and there should be a changelog entry covering it.
What currently trips up the driver is that /dev/nvidia-modeset is seen as owned by nobody:nobody, whereas the driver expects root:root. Maybe that is enough for you to find a workaround until the fix lands.
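
To see the ownership difference described here, one can compare what stat reports for the device node outside and inside a new user namespace. This is only a diagnostic sketch, assuming GNU stat and util-linux unshare; the function names owner_of and owner_in_userns are made up for illustration, and nothing here changes driver behavior.

```shell
# Print "uid:gid (user:group)" for a path as seen by the current process.
owner_of() {
    stat -c '%u:%g (%U:%G)' "$1"
}

# Print the same information as seen from inside a new user namespace with
# no uid/gid mapping. Prints "unavailable" if the namespace cannot be created
# (for example, when user namespaces are disabled on the system).
owner_in_userns() {
    unshare -U stat -c '%u:%g (%U:%G)' "$1" 2>/dev/null || echo "unavailable"
}

# Usage:
#   owner_of /dev/nvidia-modeset
#   owner_in_userns /dev/nvidia-modeset
```

Comparing against "unshare -U -r" (which maps the calling user to root inside the namespace) may be a starting point when looking for a workaround, though whether that satisfies the driver's check is untested here.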

Thanks, we will test it out.

How do I get the development driver?

Thanks

Scott

Hello,

On our machines the owner when unshared is nfsnobody:nfsnobody.

Will that break the fix?

Thanks

Scott

This is our internal branch; builds from it are not distributed directly. If your company has an existing relationship with us and an NDA, we can give you a pre-release build from a release branch.
No, the owner of /dev/nvidia* will not matter after the fix. The driver will try to open it no matter what, and also handle failure gracefully.
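
On the nobody versus nfsnobody difference raised above: inside a user namespace without a mapping, file owners show up as the kernel's overflow uid and gid, and the name printed is simply whatever the local passwd and group databases assign to those ids. A small sketch for checking this on a given machine, assuming a Linux /proc and the getent tool; the function name overflow_owner is made up for illustration.

```shell
# Print the "user:group" names that unmapped owners appear as inside a
# user namespace, falling back to numeric ids when no name is defined.
overflow_owner() {
    uid=$(cat /proc/sys/kernel/overflowuid 2>/dev/null)
    gid=$(cat /proc/sys/kernel/overflowgid 2>/dev/null)
    user=$(getent passwd "$uid" 2>/dev/null | cut -d: -f1)
    group=$(getent group "$gid" 2>/dev/null | cut -d: -f1)
    echo "${user:-$uid}:${group:-$gid}"
}

# Usage:
#   overflow_owner
```

The overflow ids default to 65534, so a distribution that names that id nfsnobody rather than nobody would report nfsnobody:nfsnobody for the same situation.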

Hello,

Any idea when the updated driver will be available? I need this for scheduling.

Could I or a NASA official sign an NDA?

Thanks

Scott

Hello @ahuillet, I am on Gentoo Linux, where the file you mentioned has the following permissions (not root:root, but with video as the group):
[17:47:47] zangetsu@andromeda  $  ~  ls -la /dev/nvidia-modeset
crw-rw---- 1 root video 195, 254 Oct 3 11:02 /dev/nvidia-modeset

I didn’t experience this behavior with a GTX 1030, but after I obtained a Quadro P4000 for triple-monitor work it started happening. The behavior varies:

  1. sometimes it completely freezes my desktop
  2. sometimes it restarts my desktop to a completely fresh instance

I use KDE Plasma. I will report this again in a fresh thread for better tracking, but I found this thread via a direct Google search, so there will be some duplicated content.

I do not understand what you are referring to when you say “this behavior”.