Using the nvidia/cuda container, I need to add TensorRT on a CUDA 10.1 host. This worked flawlessly on a CUDA 10 host. Please help, as Docker is a fundamental pillar of our infrastructure.
The following packages have unmet dependencies:
tensorrt : Depends: libnvinfer5 (= 5.0.2-1+cuda10.0) but 5.1.2-1+cuda10.1 is to be installed
Depends: libnvinfer-dev (= 5.0.2-1+cuda10.0) but 5.1.2-1+cuda10.1 is to be installed
Depends: libnvinfer-samples (= 5.0.2-1+cuda10.0) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
Although you’re installing with the “local repo” approach in your Dockerfile, the CUDA containers come pre-configured with the ML/net repo, so when you run apt-get install -y tensorrt, the net repo packages take priority because they’re newer (v5.1.2, compared to the v5.0.2 packages from your local repo).
The quick fix for this situation is to make apt not check the ml/net repo packages. You can do this by removing or renaming /etc/apt/sources.list.d/nvidia-ml.list, like so:
# Dockerfile
FROM nvidia/cuda:10.0-cudnn7-devel-ubuntu18.04
ARG TENSORRT=nv-tensorrt-repo-ubuntu1804-cuda10.0-trt5.0.2.6-ga-20181009_1-1_amd64.deb
# From the TensorRT installation instructions
ARG TENSORRT_KEY=/var/nv-tensorrt-repo-cuda10.0-trt5.0.2.6-ga-20181009/7fa2af80.pub
# Custom TensorRT installation
ADD $TENSORRT /tmp
# Rename the ML repo to something else so apt doesn't see it
RUN mv /etc/apt/sources.list.d/nvidia-ml.list /etc/apt/sources.list.d/nvidia-ml.list.bkp && \
    dpkg -i /tmp/$TENSORRT && \
    apt-key add $TENSORRT_KEY && \
    apt-get update && \
    apt-get install -y tensorrt
OR
The other way to fix this is to specify exactly which versions of libnvinfer5, libnvinfer-dev, and libnvinfer-samples you want, so that even though apt is still checking the ML repo, it installs those exact versions instead of the latest ones. You can see how to do that in Section 4.1.1, Item #4: https://docs.nvidia.com/deeplearning/sdk/tensorrt-install-guide/index.html#maclearn-net-repo-install
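As a rough, untested sketch of that second approach, the Dockerfile above could leave nvidia-ml.list in place and pin the three libnvinfer packages instead. The version string is copied from the apt error earlier in this thread; TRT_VERSION is a name made up for this example, not something from the install guide.

```dockerfile
# Sketch only: pins package versions instead of hiding the ML repo
FROM nvidia/cuda:10.0-cudnn7-devel-ubuntu18.04
ARG TENSORRT=nv-tensorrt-repo-ubuntu1804-cuda10.0-trt5.0.2.6-ga-20181009_1-1_amd64.deb
ARG TENSORRT_KEY=/var/nv-tensorrt-repo-cuda10.0-trt5.0.2.6-ga-20181009/7fa2af80.pub
# Hypothetical build arg; value taken from the "Depends:" lines in the apt error
ARG TRT_VERSION=5.0.2-1+cuda10.0
ADD $TENSORRT /tmp
# Ask apt for the exact versions so the newer ML repo packages are not selected
RUN dpkg -i /tmp/$TENSORRT && \
    apt-key add $TENSORRT_KEY && \
    apt-get update && \
    apt-get install -y \
        libnvinfer5=$TRT_VERSION \
        libnvinfer-dev=$TRT_VERSION \
        libnvinfer-samples=$TRT_VERSION \
        tensorrt
```

If a later apt-get upgrade in the same image could pull the newer packages back in, running apt-mark hold on the pinned packages should keep them at this version.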
The renaming of the source is confirmed working. Thanks a lot!
Yet I am still wondering why it falls back to an RC version rather than a GA, as well as why it picks the libraries for the wrong CUDA version (10.1 rather than the 10 installed in the Docker image).
I had to do a bit of investigation myself, but the ML/net repo currently keeps up with the most up-to-date versions available at the time, so since it’s enabled in the NGC CUDA container images, apt finds those newer versions even though they don’t match exactly.
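If you want to check this yourself, one way (a minimal sketch, assuming the same base image as above) is to have the build print which version apt considers the candidate for each package and which repo offers it:

```dockerfile
# Diagnostic sketch: shows where apt would take the libnvinfer packages from
FROM nvidia/cuda:10.0-cudnn7-devel-ubuntu18.04
# Show the ML repo entry that ships enabled in the image
RUN cat /etc/apt/sources.list.d/nvidia-ml.list
# Print the installed/candidate versions and the version table per package
RUN apt-get update && \
    apt-cache policy libnvinfer5 libnvinfer-dev
```

The "Candidate:" line in the apt-cache policy output is the version a plain apt-get install would select, and the version table underneath lists the repo each version comes from.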