I’ve been having an issue and have been scratching my head at it for a couple days so I’ve come to ask for help.
I have an Ubuntu media server for which I purchased a used GTX 1660 Ti to use as a transcoder. I do not have a monitor attached to the machine; I run it headless, connecting only via RDP from Windows or via TeamViewer. This had been working well with an AMD graphics card, although I hadn’t been using that card to transcode.
After swapping out the older AMD card, I booted the system and installed nvidia-driver-535-open from the terminal, hoping that would be all I needed. However, after rebooting, all I get when I try to RDP into the machine is a black screen. It is the same with TeamViewer, which I previously used to start my desktop session.
When SSHing into the machine I can find some information. I’m unsure what will help, so I will include what I can:
I have also attached NVIDIA and TeamViewer bug reports. I presume nothing will be found in TeamViewer’s reports, but I have included them just in case.
The reports were generated after a clean install of drivers > reboot > attempt to connect via RDP from Windows PC (black screen) > attempt to connect via TeamViewer from iOS (black screen).
I scanned through the NVIDIA logs and found this, which may help with diagnosing…
I learned that I shouldn’t be installing the open drivers, as “Open nvidia.ko is only ready for use on Data Center GPUs.” I will install the standard driver and report back with new logs.
/var/log/dmesg:
[ 3.562094] kernel: nvidia-gpu 0000:01:00.3: enabling device (0000 -> 0002)
[ 3.700362] kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 239
[ 3.752590] kernel: NVRM: loading NVIDIA UNIX Open Kernel Module for x86_64 535.171.04 Release Build (dvs-builder@U16-I3-B13-2-1) Tue Mar 19 20:44:31 UTC 2024
[ 3.776181] kernel: nvidia-modeset: Loading NVIDIA UNIX Open Kernel Mode Setting Driver for x86_64 535.171.04 Release Build (dvs-builder@U16-I3-B13-2-1) Tue Mar 19 20:26:58 UTC 2024
[ 3.780926] kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
[ 3.916237] kernel: NVRM: objClInitPcieChipset: *** Chipset Setup Function Error!
[ 4.715553] kernel: nvidia-gpu 0000:01:00.3: i2c timeout error e0000000
[ 5.344028] kernel: NVRM: Open nvidia.ko is only ready for use on Data Center GPUs.
[ 5.344032] kernel: NVRM: To force use of Open nvidia.ko on other GPUs, see the
[ 5.344034] kernel: NVRM: 'OpenRmEnableUnsupportedGpus' kernel module parameter described
[ 5.344035] kernel: NVRM: in the README.
[ 5.603300] kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x62:0x0:1921)
[ 5.603824] kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
[ 5.603882] kernel: [drm:nv_drm_load [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to allocate NvKmsKapiDevice
[ 5.603975] kernel: [drm:nv_drm_probe_devices [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to register device
[ 5.768603] kernel: nvidia-uvm: Loaded the UVM driver, major device number 237.
Attempting to use driver-535, I still get the “Open nvidia.ko” warning, so I will ignore that.
I found this in the NVIDIA logs, so the driver is definitely seeing the GPU. Does anything look obviously wrong here, apart from all display parameters being disabled?
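For anyone wanting to check whether the open kernel module is still the one being loaded, here is a minimal Python sketch that scans dmesg text for the NVRM messages quoted above. The function name `summarize_nvrm` and the `SAMPLE` string are only illustrative, and the marker strings are copied from my own log, so other driver versions may word them differently.

```python
# Marker substrings copied from the dmesg excerpt in this post (assumption:
# other driver releases may phrase these messages differently).
OPEN_MODULE_MARKER = "NVIDIA UNIX Open Kernel Module"
UNSUPPORTED_GPU_MARKER = "Open nvidia.ko is only ready for use on Data Center GPUs"
INIT_FAILED_MARKER = "RmInitAdapter failed"


def summarize_nvrm(dmesg_text: str) -> dict:
    """Report which of the known NVRM messages appear in a dmesg dump."""
    # Only look at lines emitted by the NVIDIA resource manager (NVRM).
    nvrm_lines = [line for line in dmesg_text.splitlines() if "NVRM:" in line]
    joined = "\n".join(nvrm_lines)
    return {
        "open_module_loaded": OPEN_MODULE_MARKER in joined,
        "unsupported_gpu_warning": UNSUPPORTED_GPU_MARKER in joined,
        "adapter_init_failed": INIT_FAILED_MARKER in joined,
    }


# A few lines taken from the log above, used as sample input.
SAMPLE = """\
[ 3.752590] kernel: NVRM: loading NVIDIA UNIX Open Kernel Module for x86_64 535.171.04 Release Build
[ 5.344028] kernel: NVRM: Open nvidia.ko is only ready for use on Data Center GPUs.
[ 5.603300] kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x62:0x0:1921)
[ 5.768603] kernel: nvidia-uvm: Loaded the UVM driver, major device number 237.
"""

print(summarize_nvrm(SAMPLE))
```

On a live system, the same function could be fed the saved output of `sudo dmesg` instead of `SAMPLE`. If `open_module_loaded` is still true after installing the standard driver, the open module is probably still the one being picked up at boot.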
/usr/bin/nvidia-smi --query
==============NVSMI LOG==============
Timestamp : Mon Jun 3 07:06:25 2024
Driver Version : 535.171.04
CUDA Version : 12.2
Attached GPUs : 1
GPU 00000000:01:00.0
    Product Name : NVIDIA GeForce GTX 1660 Ti
    Product Brand : GeForce
    Product Architecture : Turing
    Display Mode : Disabled
    Display Active : Disabled
    Persistence Mode : Disabled
    Addressing Mode : None
    MIG Mode
        Current : N/A
        Pending : N/A
    Accounting Mode : Disabled
    Accounting Mode Buffer Size : 4000
    Driver Model
        Current : N/A
        Pending : N/A
    Serial Number : N/A
    GPU UUID : GPU-da51989c-1ba9-a122-ed4b-30a388d78e7c
    Minor Number : 0
    VBIOS Version : 90.16.20.00.6C
    MultiGPU Board : No
    Board ID : 0x100
    Board Part Number : N/A
    GPU Part Number : 2182-400-A1
    FRU Part Number : N/A
I was unable to find a fix via software; however, if I have a monitor attached to the GPU, everything works as expected. I will be buying a dummy HDMI plug to solve my issue.
I’m unsure what caused this issue, as I already had the xrandr dummy monitor installed and set up, and it worked with the AMD GPU… But I’m happy with this solution.