"NVRM: Graphics TEX Exception - TEX LAYOUT" while experimenting with dma_bufs

OS: Ubuntu Studio 22.04.1 LTS
GPU: NVIDIA GeForce GTX 1060/PCIe/SSE2
Driver version: 525

Context:

I’ve written a quick and dirty test program to experiment with the Nvidia driver support for EGL_MESA_image_dma_buf_export and EGL_EXT_image_dma_buf_import. The goal is to find out if they’ll be useful in an application I’m designing.

Right now I know for certain that my program is not correct. In particular, I know I’m not handling the “modifier” field correctly, and there are almost certainly more bugs in it. It’s very likely that the problem originates in my code; however, it doesn’t seem right that things fail this spectacularly.
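For reference, here is a minimal sketch of how a 64-bit format modifier is usually passed on the import side. It is an illustration only, not the code from the test program. It assumes a single-plane buffer and the attribute tokens from EGL_EXT_image_dma_buf_import_modifiers, with numeric values copied from the EGL registry headers. The names build_import_attribs and MAX_IMPORT_ATTRIBS are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Token values as published in the EGL registry headers. Real code should
 * include <EGL/egl.h>, <EGL/eglext.h> and <drm_fourcc.h> rather than
 * redefining them; they are repeated here only to keep the sketch
 * self-contained. */
#ifndef EGL_NONE
#define EGL_NONE                           0x3038
#define EGL_HEIGHT                         0x3056
#define EGL_WIDTH                          0x3057
#endif
#ifndef EGL_LINUX_DRM_FOURCC_EXT
#define EGL_LINUX_DRM_FOURCC_EXT           0x3271
#define EGL_DMA_BUF_PLANE0_FD_EXT          0x3272
#define EGL_DMA_BUF_PLANE0_OFFSET_EXT      0x3273
#define EGL_DMA_BUF_PLANE0_PITCH_EXT       0x3274
#endif
#ifndef EGL_DMA_BUF_PLANE0_MODIFIER_LO_EXT
#define EGL_DMA_BUF_PLANE0_MODIFIER_LO_EXT 0x3443
#define EGL_DMA_BUF_PLANE0_MODIFIER_HI_EXT 0x3444
#endif
#ifndef DRM_FORMAT_MOD_INVALID
#define DRM_FORMAT_MOD_INVALID             0x00ffffffffffffffULL
#endif

/* Hypothetical helper: 6 key/value pairs, 2 modifier pairs, terminator. */
#define MAX_IMPORT_ATTRIBS 17

/* Fills `out` (at least MAX_IMPORT_ATTRIBS entries) with an attribute list
 * for a single-plane dma_buf import and returns the number of entries
 * written, including the EGL_NONE terminator. int32_t stands in for EGLint. */
size_t build_import_attribs(int32_t *out, int32_t width, int32_t height,
                            uint32_t fourcc, int fd, uint32_t offset,
                            uint32_t pitch, uint64_t modifier)
{
    size_t n = 0;

    assert(out != NULL);

    out[n++] = EGL_WIDTH;                     out[n++] = width;
    out[n++] = EGL_HEIGHT;                    out[n++] = height;
    out[n++] = EGL_LINUX_DRM_FOURCC_EXT;      out[n++] = (int32_t)fourcc;
    out[n++] = EGL_DMA_BUF_PLANE0_FD_EXT;     out[n++] = fd;
    out[n++] = EGL_DMA_BUF_PLANE0_OFFSET_EXT; out[n++] = (int32_t)offset;
    out[n++] = EGL_DMA_BUF_PLANE0_PITCH_EXT;  out[n++] = (int32_t)pitch;

    /* The 64-bit modifier is split into two 32-bit attributes. If the
     * exporter reported no explicit modifier, both attributes are left out
     * instead of passing a made-up value. */
    if (modifier != DRM_FORMAT_MOD_INVALID) {
        out[n++] = EGL_DMA_BUF_PLANE0_MODIFIER_LO_EXT;
        out[n++] = (int32_t)(uint32_t)(modifier & 0xffffffffu);
        out[n++] = EGL_DMA_BUF_PLANE0_MODIFIER_HI_EXT;
        out[n++] = (int32_t)(uint32_t)(modifier >> 32);
    }

    out[n++] = EGL_NONE;
    return n;
}
```

The resulting list would be handed to eglCreateImageKHR with the EGL_LINUX_DMA_BUF_EXT target, EGL_NO_CONTEXT and a NULL client buffer. The modifier itself would typically be the value reported by eglExportDMABUFImageQueryMESA on the exporting side.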

The problem appears to be reproducible. My test program is linked against a patched open source library (wlroots), and stripping it down to a minimal compilable reproducer would take a lot of work (frankly, the kind of work I’d normally expect to be paid for). However, it probably won’t be too hard to produce a statically linked binary for you, if that would be helpful.

Observed behaviour:

My test program opens its window and then the X session mostly freezes. The mouse pointer still moves, but it is not possible to interact with any windows. Switching away from the X session using Ctrl+Alt+Fn appears to succeed, but a couple of seconds later the X server crashes and restarts.

After the restart I found some interesting-looking messages in /var/log/kern.log:

Jan 22 12:50:20 rigel kernel: [174329.748769] NVRM: GPU at PCI:0000:01:00: GPU-8dd71d72-003b-00eb-98a5-37373698493c
Jan 22 12:50:20 rigel kernel: [174329.748787] NVRM: Xid (PCI:0000:01:00): 13, pid=‘’, name=, NVRM: Graphics TEX Exception on (GPC 0, TPC 0): TEX LAYOUT
Jan 22 12:50:20 rigel kernel: [174329.748824] NVRM: Xid (PCI:0000:01:00): 13, pid=‘’, name=, Graphics Exception: ESR 0x504224=0x80000009 0x504228=0x20380001 0x50422c=0x102efbe 0x504234=0x60001b60
Jan 22 12:50:20 rigel kernel: [174329.748873] NVRM: Xid (PCI:0000:01:00): 13, pid=‘’, name=, NVRM: Graphics TEX Exception on (GPC 0, TPC 0): TEX LAYOUT
Jan 22 12:50:20 rigel kernel: [174329.748897] NVRM: Xid (PCI:0000:01:00): 13, pid=‘’, name=, Graphics Exception: ESR 0x504224=0x80000009 0x504228=0x20380010 0x50422c=0x102efbb 0x504234=0x60001b60
Jan 22 12:50:20 rigel kernel: [174329.748929] NVRM: Xid (PCI:0000:01:00): 13, pid=‘’, name=, NVRM: Graphics TEX Exception on (GPC 0, TPC 1): TEX LAYOUT
Jan 22 12:50:20 rigel kernel: [174329.748947] NVRM: Xid (PCI:0000:01:00): 13, pid=‘’, name=, Graphics Exception: ESR 0x504a24=0x80000009 0x504a28=0x20380010 0x504a2c=0x102efbf 0x504a34=0x60001b60
Jan 22 12:50:20 rigel kernel: [174329.748974] NVRM: Xid (PCI:0000:01:00): 13, pid=‘’, name=, NVRM: Graphics TEX Exception on (GPC 0, TPC 1): TEX LAYOUT

This goes on and on for hundreds of lines, with a range of values for (GPC n, TPC n), and with a few of these scattered through it:

Jan 22 12:50:20 rigel kernel: [174329.749894] NVRM: Xid (PCI:0000:01:00): 13, pid=2430, name=Xorg, Graphics Exception: ChID 0018, Class 0000902d, Offset 000008dc, Data 00000000
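
Since the log runs to hundreds of lines, a small helper along these lines could pull the (GPC, TPC) pair out of each line so the affected units can be tallied. This is only a sketch: parse_gpc_tpc is a hypothetical name, and it assumes the tag always has the exact "(GPC n, TPC n)" form seen in the excerpt above.

```c
#include <stdio.h>
#include <string.h>

/* Extracts the GPC/TPC indices from one kernel log line.
 * Returns 1 and fills *gpc and *tpc when the line contains a
 * "(GPC n, TPC n)" tag, or 0 when it does not (for example the
 * "ChID ..., Class ..." lines). */
int parse_gpc_tpc(const char *line, int *gpc, int *tpc)
{
    const char *tag = strstr(line, "(GPC ");

    if (tag == NULL)
        return 0;

    /* sscanf reports how many fields were converted; both are required. */
    return sscanf(tag, "(GPC %d, TPC %d)", gpc, tpc) == 2;
}
```

Feeding each line of kern.log through this and incrementing a counter per pair would show whether the exceptions are spread evenly across the GPU or concentrated on a few units.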

Expected behaviour:

I’m not entirely sure, but I’d expect an API call to return an error and the X session to remain basically operational.

Bug dump:

nvidia-bug-report.log.gz (1.8 MB)

(This was produced after the restart. Hopefully that didn’t destroy any useful information!)