"Invalid cross-device link" when installing CUDA 11.2 from .deb (Vast.ai, Ubuntu 16.04)

Hello,

I am deploying CUDA for a client on their vast.ai GPU servers, all of which run Ubuntu 16.04.

I am following the instructions on the download page to install CUDA:

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-ubuntu1604.pin
mv cuda-ubuntu1604.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.2.0/local_installers/cuda-repo-ubuntu1604-11-2-local_11.2.0-460.27.04-1_amd64.deb
dpkg -i cuda-repo-ubuntu1604-11-2-local_11.2.0-460.27.04-1_amd64.deb
apt-key add /var/cuda-repo-ubuntu1604-11-2-local/7fa2af80.pub
apt-get update
apt-get -y install cuda

However, when I execute the last step, I get the following errors:

Errors were encountered while processing:                                                          
 /tmp/apt-dpkg-install-MFToHc/189-nvidia-460_460.27.04-0ubuntu1_amd64.deb                 
 /tmp/apt-dpkg-install-MFToHc/191-libcuda1-460_460.27.04-0ubuntu1_amd64.deb                        
 /tmp/apt-dpkg-install-MFToHc/193-nvidia-opencl-icd-460_460.27.04-0ubuntu1_amd64.deb      
W: Sources disagree on hashes for supposely identical version '11.2.0-1' of 'cuda-libraries-11-2:amd64'.
W: Sources disagree on hashes for supposely identical version '11.2.0-1' of 'cuda-libraries-11-2:amd64'.                   
W: Sources disagree on hashes for supposely identical version '11.2.0-1' of 'cuda-nsight-systems-11-2:amd64'.                            
W: Sources disagree on hashes for supposely identical version '11.2.0-1' of 'cuda-nsight-systems-11-2:amd64'.
W: Sources disagree on hashes for supposely identical version '11.2.0-1' of 'cuda-tools-11-2:amd64'.
W: Sources disagree on hashes for supposely identical version '11.2.0-1' of 'cuda-tools-11-2:amd64'.
W: Sources disagree on hashes for supposely identical version '11.2.0-1' of 'cuda-11-2:amd64'.
W: Sources disagree on hashes for supposely identical version '11.2.0-1' of 'cuda-11-2:amd64'.
E: Sub-process /usr/bin/dpkg returned an error code (1)

To fix this, I ran apt-get --fix-broken install. However, this produces a different set of errors:

dpkg: error processing archive /var/cuda-repo-ubuntu1604-11-2-local/./nvidia-460_460.27.04-0ubuntu1_amd64.deb (--unpack):
trying to overwrite '/usr/lib/x86_64-linux-gnu/libGLX_indirect.so.0', which is also in package libglx-mesa0:amd64 20.0.8-0ubuntu1~18.04.1
Preparing to unpack .../libcuda1-460_460.27.04-0ubuntu1_amd64.deb ...                               
Unpacking libcuda1-460 (460.27.04-0ubuntu1) ...                                               
dpkg: error processing archive /var/cuda-repo-ubuntu1604-11-2-local/./libcuda1-460_460.27.04-0ubuntu1_amd64.deb (--unpack):
unable to make backup link of './usr/lib/x86_64-linux-gnu/libcuda.so.460.27.04' before installing new version: Invalid cross-device link
dpkg-deb: error: paste subprocess was killed by signal (Broken pipe)
Preparing to unpack .../nvidia-opencl-icd-460_460.27.04-0ubuntu1_amd64.deb ...
Unpacking nvidia-opencl-icd-460 (460.27.04-0ubuntu1) ...
dpkg: error processing archive /var/cuda-repo-ubuntu1604-11-2-local/./nvidia-opencl-icd-460_460.27.04-0ubuntu1_amd64.deb (--unpack):
unable to make backup link of './usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.460.27.04' before installing new version: Invalid cross-device link
dpkg-deb: error: paste subprocess was killed by signal (Broken pipe)
Errors were encountered while processing:
/var/cuda-repo-ubuntu1604-11-2-local/./nvidia-460_460.27.04-0ubuntu1_amd64.deb
/var/cuda-repo-ubuntu1604-11-2-local/./libcuda1-460_460.27.04-0ubuntu1_amd64.deb
/var/cuda-repo-ubuntu1604-11-2-local/./nvidia-opencl-icd-460_460.27.04-0ubuntu1_amd64.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)

I got past the first error, the one for nvidia-460_460.27.04-0ubuntu1_amd64.deb, by running apt-get -o Dpkg::Options::="--force-overwrite" install --fix-broken as recommended here. However, this does not fix the other two errors, which still fail with Invalid cross-device link.

I have isolated the problem to:

root@C.706583:~$ dpkg -i --force-overwrite /var/cuda-repo-ubuntu1604-11-2-local/./libcuda1-460_460.27.04-0ubuntu1_amd64.deb
dpkg: error processing archive /var/cuda-repo-ubuntu1604-11-2-local/./libcuda1-460_460.27.04-0ubuntu1_amd64.deb (--install):
unable to make backup link of './usr/lib/x86_64-linux-gnu/libcuda.so.460.27.04' before installing new version: Invalid cross-device link

Apparently, on all of the Vast.ai servers, a second partition is mounted at /usr/bin/nvidia-smi, even though that path is a regular file rather than a directory.

root@C.706582:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay         331G  136K  331G   1% /
tmpfs            64M     0   64M   0% /dev
tmpfs            63G     0   63G   0% /sys/fs/cgroup
shm             8.0G     0  8.0G   0% /dev/shm
/dev/sda4       3.7T  206G  3.5T   6% /etc/hosts
tmpfs            63G   12K   63G   1% /proc/driver/nvidia
/dev/sda2        21G   12G  8.0G  59% /usr/bin/nvidia-smi
udev             63G     0   63G   0% /dev/nvidia0
tmpfs            63G     0   63G   0% /proc/asound
tmpfs            63G     0   63G   0% /proc/acpi
tmpfs            63G     0   63G   0% /proc/scsi
tmpfs            63G     0   63G   0% /sys/firmware
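
For anyone trying to reproduce this, here is a rough sketch of how the file-level mounts could be listed. It assumes a Linux system that exposes /proc/self/mountinfo, where the fifth field of each line is the mount point. The function name list_file_mounts is made up for illustration, and mount points containing spaces (escaped as \040 in mountinfo) are not handled.

```shell
# Hypothetical helper: print every mount point that is a regular file
# rather than a directory (i.e. a single-file mount such as /etc/hosts).
# Reads mountinfo-format input; field 5 of each line is the mount point.
list_file_mounts() {
    mountinfo="${1:-/proc/self/mountinfo}"
    while read -r _id _parent _devno _root mount_point _rest; do
        # -f is true only for regular files, so directories and
        # device nodes such as /dev/nvidia0 are skipped.
        if [ -f "$mount_point" ]; then
            printf '%s\n' "$mount_point"
        fi
    done < "$mountinfo"
}
```

Running list_file_mounts with no arguments inside the container should print paths like /etc/hosts and /usr/bin/nvidia-smi. If any driver library under /usr/lib/x86_64-linux-gnu shows up in that list, it is mounted into the container the same way.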

As I understand it, an invalid cross-device link error occurs when you try to create a hard link from a file on one filesystem to a path on another.
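
A minimal way to check this for a given pair of paths, assuming GNU stat from coreutils (as shipped with Ubuntu), is to compare their device IDs. The function name same_device is hypothetical. Note that differing IDs are enough for a hard link to fail, but matching IDs do not guarantee success, because link(2) also refuses to cross separate mount points of the same filesystem.

```shell
# Hypothetical helper: succeed (exit 0) if both paths report the same
# device ID, fail otherwise. Uses GNU stat's %d format (device number).
same_device() {
    [ "$(stat -c '%d' -- "$1")" = "$(stat -c '%d' -- "$2")" ]
}
```

For example, same_device /usr/lib/x86_64-linux-gnu/libcuda.so.460.27.04 /usr/lib/x86_64-linux-gnu || echo "different device" would report whether the existing library sits on a different device than the directory dpkg is trying to create its backup link in.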

Is there any way around this issue to successfully install CUDA Toolkit 11.2?
