I am using a mirrored copy of the NVIDIA Ubuntu (focal) repos and trying to figure out how to bring my install up to the latest possible build that supports my dated GPUs (sm35 and sm37). Support for them has obviously been removed from drivers newer than R470, but the documentation makes it sound like the actual CUDA libraries have only deprecated, not removed, support for these Kepler-generation GPUs.
The two things I feel I should be able to do are either:
A) install the R470 drivers with cuda-11-{5-7}
or
B) install cuda-11.4.4 from the up-to-date repo, which does not appear to be possible:
$ sudo apt install cuda
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
ca-certificates-java cuda-11-7 [snip] nvidia-dkms-515 nvidia-driver-515 nvidia-kernel-common-515 [snip]
$ sudo apt install cuda=11.4.4-1   # or the cuda-11-4 meta-package
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
ca-certificates-java cuda-11-4 [snip] nvidia-dkms-515 nvidia-driver-515 nvidia-kernel-common-515 [snip]
$ sudo apt install cuda=11.4.4-1 cuda-drivers-470
Reading package lists... Done
Building dependency tree
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:
The following packages have unmet dependencies:
cuda : Depends: cuda-11-4 (>= 11.4.4) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
$ sudo apt install cuda-11-4 cuda-drivers-470
Reading package lists... Done
Building dependency tree
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:
The following packages have unmet dependencies:
cuda-11-4 : Depends: cuda-runtime-11-4 (>= 11.4.4) but it is not going to be installed
Depends: cuda-demo-suite-11-4 (>= 11.4.100) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
Right now I’m forced to live with my 11.4.2 snapshot, because 11.4.3 (and 11.4.4) were released after 11.5.0 (and 11.6.0, respectively).
So I’m curious whether this is simply not a supported or possible installation type. It seems odd to release a newer 11.4 version that can’t effectively be installed for older GPUs, unless the only purpose of the later point releases is to pair newer drivers with older libraries?
Also, to be clear, this isn’t specific to 11.4.x: trying to install 11.x.y where x<4 results in the same scenario, with the R515 drivers set to be installed.
Installing the R470 drivers before the CUDA packages doesn’t help either: the R515 drivers still get installed, and the R470 drivers get removed.
$ sudo apt install cuda=11.4.4-1
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following package was automatically installed and is no longer required:
libnvidia-common-470
Use 'sudo apt autoremove' to remove it.
The following additional packages will be installed:
ca-certificates-java cuda-11-4 cuda-cccl-11-4 cuda-command-line-tools-11-4 cuda-compiler-11-4 cuda-cudart-11-4 cuda-cudart-dev-11-4 cuda-cuobjdump-11-4 cuda-cupti-11-4 cuda-cupti-dev-11-4 cuda-cuxxfilt-11-4 cuda-demo-suite-11-4
cuda-documentation-11-4 cuda-driver-dev-11-4 cuda-drivers cuda-drivers-515 cuda-gdb-11-4 cuda-libraries-11-4 cuda-libraries-dev-11-4 cuda-memcheck-11-4 cuda-nsight-11-4 cuda-nsight-compute-11-4 cuda-nsight-systems-11-4 cuda-nvcc-11-4
cuda-nvdisasm-11-4 cuda-nvml-dev-11-4 cuda-nvprof-11-4 cuda-nvprune-11-4 cuda-nvrtc-11-4 cuda-nvrtc-dev-11-4 cuda-nvtx-11-4 cuda-nvvp-11-4 cuda-runtime-11-4 cuda-samples-11-4 cuda-sanitizer-11-4 cuda-toolkit-11-4
cuda-toolkit-11-4-config-common cuda-toolkit-11-config-common cuda-toolkit-config-common cuda-tools-11-4 cuda-visual-tools-11-4 default-jre default-jre-headless fonts-dejavu-extra gds-tools-11-4 java-common libatk-wrapper-java
libatk-wrapper-java-jni libcublas-11-4 libcublas-dev-11-4 libcufft-11-4 libcufft-dev-11-4 libcufile-11-4 libcufile-dev-11-4 libcurand-11-4 libcurand-dev-11-4 libcusolver-11-4 libcusolver-dev-11-4 libcusparse-11-4 libcusparse-dev-11-4
libgif7 libnpp-11-4 libnpp-dev-11-4 libnvidia-cfg1-515 libnvidia-common-515 libnvidia-compute-515 libnvidia-decode-515 libnvidia-encode-515 libnvidia-extra-515 libnvidia-fbc1-515 libnvidia-gl-515 libnvjpeg-11-4 libnvjpeg-dev-11-4
libxxf86dga1 nsight-compute-2021.2.2 nsight-systems-2021.3.2 nvidia-compute-utils-515 nvidia-dkms-515 nvidia-driver-515 nvidia-kernel-common-515 nvidia-kernel-source-515 nvidia-utils-515 openjdk-11-jre openjdk-11-jre-headless x11-utils
xserver-xorg-video-nvidia-515
Suggested packages:
fonts-ipafont-gothic fonts-ipafont-mincho fonts-wqy-microhei | fonts-wqy-zenhei fonts-indic mesa-utils
Recommended packages:
libnvidia-compute-515:i386 libnvidia-decode-515:i386 libnvidia-encode-515:i386 libnvidia-fbc1-515:i386 libnvidia-gl-515:i386
The following packages will be REMOVED:
cuda-drivers-470 libnvidia-cfg1-470 libnvidia-compute-470 libnvidia-decode-470 libnvidia-encode-470 libnvidia-extra-470 libnvidia-fbc1-470 libnvidia-gl-470 libnvidia-ifr1-470 nvidia-compute-utils-470 nvidia-dkms-470 nvidia-driver-470
nvidia-kernel-common-470 nvidia-kernel-source-470 nvidia-utils-470 xserver-xorg-video-nvidia-470
So I’m really not sure what to make of this, other than that if I didn’t have the 11.4.2 repo snapshot, I’d be SOL: any package update would pull in the R515 drivers from upstream, and there doesn’t appear to be any clean way back, at least with a package manager.
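For anyone wanting to check which driver branch an install would pull in before committing to it, a dry run is one option. This is only a minimal sketch: it assumes the simulated apt output names the driver packages as nvidia-driver-NNN (as the output above does), and driver_branches is a hypothetical helper name of mine, not part of apt or any NVIDIA tooling. It lists every branch mentioned, whether that package would be installed or removed.

```shell
# Hypothetical helper: print the distinct driver branches (e.g. 470, 515)
# named in apt output read from stdin.
driver_branches() {
    grep -oE 'nvidia-driver-[0-9]+' | sed 's/^nvidia-driver-//' | sort -u
}

# Example dry run (commented out so nothing is resolved by accident).
# --simulate makes apt-get resolve dependencies without changing the system:
# apt-get install --simulate cuda=11.4.4-1 | driver_branches
```

If 515 shows up in the result, the transaction would touch the R515 packages, which matches what the real install attempts above report.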
Edit: including the full rationale for not using the R515 drivers, to make the problem clearer.
[ 334.731528] NVRM: The NVIDIA Tesla K40m GPU installed in this system is
NVRM: supported through the NVIDIA 470.xx Legacy drivers. Please
NVRM: visit http://www.nvidia.com/object/unix.html for more
NVRM: information. The 515.43.04 NVIDIA driver will ignore
NVRM: this GPU. Continuing probe...
[ 334.731534] NVRM: The NVIDIA Tesla K40m GPU installed in this system is
NVRM: supported through the NVIDIA 470.xx Legacy drivers. Please
NVRM: visit http://www.nvidia.com/object/unix.html for more
NVRM: information. The 515.43.04 NVIDIA driver will ignore
NVRM: this GPU. Continuing probe...
[ 334.731537] NVRM: No NVIDIA GPU found.
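The kernel log above already names the branch these GPUs need, so it can be pulled out mechanically. A rough sketch, assuming the NVRM message keeps the "NNN.xx Legacy" wording shown above; legacy_branch is a hypothetical helper name, and reading the kernel log may need root on some systems.

```shell
# Hypothetical helper: extract the legacy driver branch (e.g. 470) that the
# NVRM kernel messages on stdin say the GPU is supported through.
legacy_branch() {
    grep -oE '[0-9]+\.xx Legacy' | cut -d. -f1 | sort -u
}

# Example (commented out; may require sudo if kernel log access is restricted):
# dmesg | legacy_branch
```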