I have been having some issues with my desktop. After starting the computer, I get a black screen after the Dell loading screen and can only interact with the computer by entering recovery mode. If I run nvidia-smi in recovery mode, the command works and correctly recognizes the GPU. The current driver is 535.183.01 with CUDA 12.2.
I have tried seating a different, older GeForce graphics card, which allows me to start the computer normally and interact with my screen. I contacted Dell, and they sent me a replacement RTX A4000, which runs into the same problem as the old one. So I think it is a driver problem, but I have tried many of the fixes suggested in previous threads and had no success. Please let me know if there are commands or log files I should include; I would appreciate any guidance. Thanks! nvidia-bug-report.log (2.0 MB)
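In case it helps anyone collecting the same details, here is a minimal sketch that pulls the GPU name, driver version, and display state from nvidia-smi in one go. It assumes nvidia-smi is on the PATH and that the `name`, `driver_version`, and `display_active` query fields are available in your driver version (`nvidia-smi --help-query-gpu` lists the supported ones). The function names are my own, not part of any NVIDIA tool.

```python
import csv
import io
import subprocess

# Query fields passed to `nvidia-smi --query-gpu`; assumed to be supported
# by the installed driver (check `nvidia-smi --help-query-gpu`).
FIELDS = ["name", "driver_version", "display_active"]


def parse_gpu_csv(text):
    """Parse `--format=csv,noheader` output into one dict per GPU."""
    rows = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [
        dict(zip(FIELDS, (cell.strip() for cell in row)))
        for row in rows
        if row  # skip blank lines
    ]


def query_gpus():
    """Run nvidia-smi and return parsed rows, or [] if it is unavailable or fails."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader",
    ]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    except (FileNotFoundError, subprocess.CalledProcessError):
        return []
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    for gpu in query_gpus():
        print(gpu)
```

Note that `display_active` only says whether a display is initialized on the GPU, so it does not by itself prove that a monitor is receiving a signal.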
Thanks for taking a look. Unfortunately yes, I’ve tried others, and in those cases the Dell startup screen doesn’t show up at all. Happy to try any other suggestions.
We have a similar system, a Dell Precision 7960 Rack, with the same card, an RTX A4000. We also cannot get the system to recognize a DFP; it reports that all DFPs are disconnected. We tried the 510, 550, and 570 drivers on Rocky Linux 8.6 and 8.9, including the Dell RH drivers. After disabling the internal (Matrox) video in the BIOS, there is also no console output on the DFP connected via DisplayPort.
The OS recognizes the card and nvidia-smi works; there is just no output signal on any of the four DP ports, and X terminates.
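One way to cross-check the "all DFPs are disconnected" message outside of X is to read the connector status files the kernel exposes under /sys/class/drm. The sketch below is only an illustration: it assumes the driver registers its connectors with DRM, which for the proprietary NVIDIA driver generally means nvidia-drm with mode setting enabled, so the list may well be empty on a system like ours. The function name is made up for this example.

```python
from pathlib import Path


def connector_status(drm_root="/sys/class/drm"):
    """Return {connector_name: status} for each DRM connector under drm_root."""
    statuses = {}
    root = Path(drm_root)
    if not root.is_dir():
        return statuses
    # Connector directories are named like card0-DP-1 and hold a `status` file
    # containing "connected", "disconnected", or "unknown".
    for status_file in sorted(root.glob("card*-*/status")):
        try:
            statuses[status_file.parent.name] = status_file.read_text().strip()
        except OSError:
            # Skip entries that cannot be read.
            continue
    return statuses


if __name__ == "__main__":
    found = connector_status()
    if not found:
        print("no DRM connectors found")
    for name, status in found.items():
        print(f"{name}: {status}")
```

If a DP connector shows up here as connected while X still reports no DFP, that would point at the X configuration rather than the cable or monitor; if nothing is listed at all, the check is simply inconclusive.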
Were you able to resolve your problem or make any diagnostic progress?
I also tried two different monitors which are confirmed working on another system.
Thanks.
PS: I have attached the nvidia-bug-report, which has logs of multiple attempts; the system is currently using v510. nvidia-bug-report.log.gz (566.9 KB)
I finally got output from the DisplayPort of the A4000 after nothing else seemed to work. The critical step was adding the nomodeset kernel parameter. After that, fully power cycling the monitor by replugging it let X (and/or the OS) recognize the connection.
After that, rebooting again even had the monitor mirror the motherboard-connected monitor during the first power-up, which had not been the case before (there was never any signal). Possibly the NVIDIA card or some other hardware component somehow remembered the earlier, successful connection.
As far as I can tell, the nomodeset kernel parameter disables graphics mode output from the internal graphics and, weirdly, still seems necessary even if the internal graphics is disabled in the BIOS and Linux does not see that hardware.
I am not sure if I can or want to understand all that, and am just glad that there was a solution.
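For anyone repeating the nomodeset step, it is worth confirming that the parameter actually reached the running kernel before power cycling the monitor. Here is a small sketch that reads /proc/cmdline and looks for it. The helper names are hypothetical, and the parsing is deliberately simple: it splits on whitespace and does not handle quoted values that contain spaces.

```python
def kernel_params(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def has_param(cmdline, name):
    """Return True if `name` appears as a parameter on the command line."""
    return name in kernel_params(cmdline)


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as fh:
            current = fh.read()
    except OSError:
        # Not on Linux, or /proc is unavailable.
        current = ""
    print("nomodeset active:", has_param(current, "nomodeset"))
```

If it prints False after a reboot, the parameter was probably added only for a single boot or the bootloader configuration was not regenerated.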
I see in the original bug report that nomodeset was already used, so that must be a different issue. I did need to hard power cycle the monitor, though.
Also, I see nvidia-persistenced in the bug report, which was flagged as problematic in another thread that seemed related (you may want to search for that).