Sudden Screen Shutdown Issue with NVIDIA RTX A2000 GPU

Hi,
I have a PC running Linux (Ubuntu 22.04) using an NVIDIA RTX A2000 GPU.

  • Kernel Version: Linux bmt9636 6.8.0-45-generic #45~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Wed Sep 11 15:25:05 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
  • NVIDIA driver: 560.28.03

The system generally works fine, but occasionally the screen shuts down suddenly, even when the system is not running heavy tasks. We’ve tried running tests to reproduce the issue, but it occurs sporadically and without a clear pattern.
I saved the NVIDIA-related logs from the last time this happened. They are attached here:
nvidia_log.txt (15.5 KB)

Some notable errors from the logs are:

  • kernel: [drm:nv_drm_gem_export_nvkms_memory_ioctl [nvidia_drm]] ERROR [nvidia-drm] [GPU ID 0x00000100] Failed to export memory from NVKMS GEM object: 0x00000001
  • os_dump_stack+0xe/0x20 [nvidia]
  • _nv012948rm+0x2c5/0x590 [nvidia]
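
For anyone who wants to pull the same kind of lines out of their own kernel log, here is a minimal Python sketch. The function name and the pattern are illustrative assumptions, not part of any NVIDIA tool; the pattern only covers the module names that appear in the lines quoted above.

```python
import re

# Assumed pattern: module names/prefixes seen in the log lines quoted above.
NVIDIA_PATTERN = re.compile(r"nvidia|nvkms|NVRM", re.IGNORECASE)


def filter_nvidia_lines(lines):
    """Return only the log lines that mention an NVIDIA kernel module."""
    return [line.rstrip("\n") for line in lines if NVIDIA_PATTERN.search(line)]
```

The input could be, for example, the lines of a file saved from `journalctl -k` or `dmesg`. A broader or narrower pattern may be needed depending on the driver version.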

Can anyone provide insight into this issue or suggest potential fixes?