Why is it so confusing to update nvidia drivers for an older Tesla C2070?

faraz · January 27, 2021, 9:05pm

I’ve spent half a day trying to reconcile cuda/nvidia drivers with centos 7 compatibility. Don’t really know what is compatible with what. Trying to avoid copying/pasting commands as root and breaking my system. Below I copied what nvidia-smi shows.

I want to upgrade the cuda library so that I can run tensorflow which seems to require version 8+. Right now it is at version 7 I believe.

I went to the NVIDIA website to download NVIDIA-Linux-x86_64-410.129-diagnostic.run for my Tesla C2070. When I ran it I got this scary message, so I cancelled it.

"WARNING: The NVIDIA Quadro FX 580 GPU installed in this system is supported through the NVIDIA 340.xx legacy Linux graphics drivers. Please visit Unix Drivers | NVIDIA for more information. The 410.129 NVIDIA Linux graphics driver will ignore this GPU. "

+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0            Not Supported                                               |
+-----------------------------------------------------------------------------+

njuffa · January 27, 2021, 9:33pm

Before you do anything else, carefully check the Tensorflow hardware requirements. I don’t think Tensorflow supports these outdated GPUs. A Tesla C2070 has compute capability 2.0. I don’t think that is sufficient to run Tensorflow, but don’t take my word for it, check the official documentation.

The Quadro FX 580 named in that warning is an ancient GPU from 2009 with compute capability 1.1. It is not supported by the 410 driver, only by the much older 340 driver, as the warning message tells you. You can only run with one NVIDIA driver at any given time. So the driver support (or lack thereof) for the oldest GPU in the system will limit how recent a driver you can install.

faraz · January 27, 2021, 9:57pm

Thanks. I checked and Tensorflow supports the hardware listed below. The confusing part is what my “cuda architecture” is. I believe my libcuda is 7.0? So is that what I am running? I also remember seeing some error message saying my cuda is 3.0? Not sure which it is…

NVIDIA® GPU card with CUDA® architectures 3.5, 5.0, 6.0, 7.0, 7.5, 8.0 and higher than 8.0

faraz · January 27, 2021, 10:03pm

I see. Let’s say I install the latest driver. Does this mean nvidia will only be able to see and use the C2070? Will the OS still be able to use the Quadro FX 580? I presume the Quadro is there to drive a dual-monitor setup. Unfortunately I do not have the machine in hand. I am helping someone else who is remote.

njuffa · January 27, 2021, 10:04pm

What they mean by “CUDA architecture” is “compute capability” in NVIDIA terminology. I don’t know why the Tensorflow folks didn’t use the long-established terminology, thus increasing the probability of confusion.

The minimum compute capability required by Tensorflow is 3.5, the Tesla C2070 has 2.0. You will need newer GPU hardware. Since older GPU architectures become obsolete all the time, I would suggest acquiring GPUs with compute capability 6.0 or higher.
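To make that comparison concrete, here is a minimal Python sketch. It only restates the numbers quoted in this thread (minimum 3.5, Tesla C2070 at 2.0, Quadro FX 580 at 1.1) and does not query any hardware. The names `MIN_REQUIRED`, `GPUS` and `meets_minimum` are made up for illustration. Compute capabilities are held as (major, minor) tuples rather than floats so that the comparison goes by major version first, then minor.

```python
from typing import Dict, Tuple

ComputeCapability = Tuple[int, int]  # (major, minor)

# Minimum compute capability for TensorFlow as quoted in this thread.
MIN_REQUIRED: ComputeCapability = (3, 5)

# Compute capabilities as stated in this thread (hard-coded, not detected).
GPUS: Dict[str, ComputeCapability] = {
    "Tesla C2070": (2, 0),
    "Quadro FX 580": (1, 1),
}


def meets_minimum(cc: ComputeCapability,
                  minimum: ComputeCapability = MIN_REQUIRED) -> bool:
    """Return True if compute capability `cc` is at least `minimum`."""
    # Tuple comparison checks the major number first, then the minor number.
    return cc >= minimum


if __name__ == "__main__":
    for name, cc in GPUS.items():
        verdict = "sufficient" if meets_minimum(cc) else "too old"
        print(f"{name}: compute capability {cc[0]}.{cc[1]} -> {verdict}")
```

Both GPUs in this system come out as "too old" against a 3.5 minimum, while a compute capability 6.0 part would pass.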

njuffa · January 27, 2021, 10:09pm

The latest available drivers will support neither the Quadro FX 580 nor the Tesla C2070. Both of these are utterly obsolete. Given the current rate of technological progress in the GPU field, the active usable lifetime of a GPU is about five years. Beyond that you’ll run into issues with requiring all sorts of legacy software components (old CUDA versions, old drivers, old OS version, old host toolchains) to operate the hardware. Not worth the hassle.

If there is an issue of financial constraint, consider deploying fairly recent second-hand (used) hardware.

faraz · January 27, 2021, 10:15pm

Ok, but my understanding is that running NVIDIA-Linux-x86_64-410.129-diagnostic.run will support the C2070 but not the FX 580. Since I risk breaking the system, I did not run it. The nvidia website told me to download that file for the C2070.

I see your point about this stuff being obsolete. But for now I am curious to see what kind of performance I can get out of the C2070. I understand the compute capability is 2.0 which means I will probably need to install an older tensorflow version.

njuffa · January 27, 2021, 10:20pm

If you asked the NVIDIA site for the latest driver that supports the C2070, it recommended the appropriate driver version. But that driver does not support the Quadro FX 580 (which is even older), so the Quadro would be inoperable under that driver version. If you need a working Quadro FX 580 in the system for graphics output, you would certainly have a problem.

Generally speaking, if you need all GPUs installed in a system to be in working order, you would want to ask the NVIDIA site for the latest package supporting the oldest GPU in the system.
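As a rough illustration of that rule, the Python sketch below picks the newest driver branch that still covers every GPU in a system. The table is not official NVIDIA data: it only records what was reported in this thread (340 for the Quadro FX 580 from the installer warning, 410 for the Tesla C2070 from the download page suggestion). The names `LAST_SUPPORTED_BRANCH` and `newest_common_branch` are hypothetical. It also assumes, as a simplification, that a GPU works with every branch up to its last supported one.

```python
from typing import Dict, Iterable

# Newest driver branch per GPU, as reported in this thread.
# Illustrative only; check NVIDIA's driver download page for real systems.
LAST_SUPPORTED_BRANCH: Dict[str, int] = {
    "Quadro FX 580": 340,
    "Tesla C2070": 410,
}


def newest_common_branch(installed_gpus: Iterable[str],
                         table: Dict[str, int] = LAST_SUPPORTED_BRANCH) -> int:
    """Return the newest driver branch that still supports every listed GPU.

    Only one NVIDIA driver can be active at a time, so the GPU with the
    oldest "last supported" branch sets the limit for the whole system.
    """
    return min(table[gpu] for gpu in installed_gpus)


if __name__ == "__main__":
    print(newest_common_branch(["Quadro FX 580", "Tesla C2070"]))  # prints 340
```

With both cards installed the answer is the 340 branch. Taking the Quadro out of the list would lift the limit to whatever the C2070 alone allows.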