Hello,
I am trying to build an nvidia-docker image with TensorRT installed for my specific application. I can't use any of the provided TensorRT base images, because they use a CUDA version that is not compatible with the application, but I have a custom TensorRT Debian package that is used in my organization. The problem is that when I install it from the Dockerfile, it also installs the NVIDIA drivers. As a result, the image is built successfully, but a container can't be started from it. The result is:
svc_moma_usr@PL1LXD-529389:~/gutkowsp/Docker_projects/test_cuda$ nvidia-docker run tensorrt-test
docker: Error response from daemon: OCI runtime create failed: container_linux.go:346: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 1 caused \\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: mount error: file creation failed: /var/lib/docker/overlay2/97f449ff2535b1ad304520dae75c613931888658a66b89235b0d040a872a625c/merged/usr/bin/nvidia-smi: file exists\\\\n\\\"\"": unknown.
ERRO[0001] error waiting for container: context canceled
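To narrow this down, a check along the following lines could confirm that the conflicting file is already part of the image's filesystem. This is only a sketch: the function name `list_driver_files` is made up for illustration, and the list of paths is my assumption about which driver-side binaries the NVIDIA runtime tries to mount, not an exhaustive list.

```shell
#!/bin/sh
# Hypothetical helper: print driver-side binaries that already exist
# under a given root filesystem (defaults to "/").
# The path list below is an assumption, not a complete list.
list_driver_files() {
    root="${1:-/}"
    for f in usr/bin/nvidia-smi \
             usr/bin/nvidia-debugdump \
             usr/bin/nvidia-persistenced; do
        # Strip a trailing slash from root so "/" and "/some/dir/" both work.
        if [ -e "${root%/}/$f" ]; then
            echo "/$f"
        fi
    done
}
```

Called without arguments inside the image, it inspects the image's own root filesystem. If it prints `/usr/bin/nvidia-smi`, then `dpkg -S /usr/bin/nvidia-smi` should name the package that installed the file.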
The Dockerfile is:
FROM nvidia/cuda:9.1-devel-ubuntu16.04
ENV DEBIAN_FRONTEND=noninteractive
ENV CUDNN_VERSION=7.0.5.15
LABEL com.nvidia.cudnn.version="${CUDNN_VERSION}"
RUN apt update -y && \
    apt install software-properties-common -y && \
    apt-add-repository --yes --update ppa:ansible/ansible && \
    apt install ansible -y
RUN apt update -y && \
    apt install -y --no-install-recommends \
        libcudnn7=$CUDNN_VERSION-1+cuda9.1 \
        libcudnn7-dev=$CUDNN_VERSION-1+cuda9.1
RUN apt update -y && \
    apt install tensorrt -y
How is this problem of unnecessary drivers usually solved? It seems like a common issue, since NVIDIA Docker images typically contain NVIDIA software, which usually comes with drivers. Could someone share the Dockerfiles for the TensorRT images for reference?