Hi, I switched an RTX 6000 Ada, dedicated to AI/ML workloads, to compute mode using displaymodeselector v1.59.0 (“displaymodeselector --gpumode compute”). The switch completed successfully (log attached). Unfortunately, the host system doesn’t seem to support this mode, and none of the recent drivers will load. In addition, the latest “displaymodeselector --listgpumodes” now either hangs or fails with this error:
terminate called after throwing an instance of 'std::runtime_error'
what(): A timeout occurred while waiting for uproc response.
Aborted
The driver also fails to install. Previously, in graphics mode, the driver loaded fine. I suspect the error comes from the BIOS, as we’re using a consumer-grade system. I’ve attached dmesg, nvidia-install.log and dmidecode output.
What options do we have? Should we use a certified system that supports the RTX 6000 Ada in compute mode, or is there a way to switch the GPU back to graphics mode?
The board seems to support the RTX fine, but the GPU is not responding anymore. It seems something went wrong while reflashing. Please try powering off the system and disconnecting power, possibly even removing the RTX from its slot and letting it discharge. Then put it back in and see if it comes alive again.
Thank you! This helped. We actually have two identical cards, both in compute mode. After letting them discharge, we installed them one at a time into the same PCI slot, and each loaded the driver.
Now, with both installed together in the 1st and 3rd slots, only one is detected. If we disable the first card, the second still loads the driver. If we disable the second GPU’s PCI slot, the first isn’t detected.
So we have two identical cards: one is completely fine and stable, while the other worked after we let it discharge and then stopped loading the driver again.
What do you think would be the right approach to get both cards working? I’m attaching the necessary debug data.
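For a case like this, where a card may be visible on the PCI bus but not bound by the driver, it can help to compare the two views directly. Below is a minimal sketch, assuming a Linux host with `lspci` (from pciutils) and `nvidia-smi` on the PATH and a single PCI domain. The function names are hypothetical and only for illustration; this is not an NVIDIA tool.

```python
import re
import subprocess

NVIDIA_VENDOR_ID = "10de"  # NVIDIA's PCI vendor ID
# PCI class codes: 0300 = VGA compatible controller, 0302 = 3D controller.
# Both are accepted, since a GPU may enumerate under either class.
DISPLAY_CLASSES = {"0300", "0302"}

# Matches one line of `lspci -nn`, e.g.
# "01:00.0 3D controller [0302]: NVIDIA Corporation ... [10de:xxxx] (rev a1)"
_LSPCI_LINE = re.compile(
    r"^(?P<addr>\S+)\s.*?\[(?P<cls>[0-9a-f]{4})\]:"
    r".*\[(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})\]"
)


def nvidia_gpus_on_bus(lspci_output):
    """Return PCI addresses of NVIDIA display-class devices in `lspci -nn` text."""
    addrs = []
    for line in lspci_output.splitlines():
        m = _LSPCI_LINE.match(line)
        if m and m["vendor"] == NVIDIA_VENDOR_ID and m["cls"] in DISPLAY_CLASSES:
            addrs.append(m["addr"])
    return addrs


def short_addr(addr):
    """Reduce '00000000:01:00.0' or '0000:01:00.0' to '01:00.0'.

    Assumption: a single PCI domain, so the domain prefix can be dropped.
    """
    return ":".join(addr.strip().lower().split(":")[-2:])


def missing_from_driver(lspci_output, smi_bus_ids):
    """List GPUs that the bus enumerates but the driver does not report."""
    bound = {short_addr(a) for a in smi_bus_ids.splitlines() if a.strip()}
    return [a for a in nvidia_gpus_on_bus(lspci_output) if short_addr(a) not in bound]


def _run(cmd):
    """Run a command and return its stdout, or '' if it is missing or fails."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True)
    except FileNotFoundError:
        return ""
    return result.stdout if result.returncode == 0 else ""


def report():
    """Print which NVIDIA GPUs are on the bus and which the driver has bound."""
    lspci_out = _run(["lspci", "-nn"])
    smi_out = _run(["nvidia-smi", "--query-gpu=pci.bus_id", "--format=csv,noheader"])
    print("On PCI bus:        ", nvidia_gpus_on_bus(lspci_out))
    print("Missing from driver:", missing_from_driver(lspci_out, smi_out))
```

Calling `report()` after each slot change gives a quick record of whether a card has dropped off the bus entirely or is enumerated but not picked up by the driver. Those two cases point in different directions (slot, BIOS, or power versus driver or firmware state).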