Hi,
I am trying to build my own Docker image for running TensorFlow on the GPU.
My base image is:
FROM nvidia/cuda:10.1-base-ubuntu18.04
LABEL authors="Lukas Heumos" \
      description="Docker image containing all requirements for running machine learning on CUDA enabled GPUs"

# Install some basic utilities
RUN apt-get update && apt-get install -y \
    curl \
    wget \
    ca-certificates \
    sudo \
    git \
    bzip2 \
    libx11-6 \
    && rm -rf /var/lib/apt/lists/*

# Create a working directory and set it as default
RUN mkdir /app
RUN chmod 777 /app
WORKDIR /app

# Create a non-root user and switch to it
RUN adduser --disabled-password --gecos '' --shell /bin/bash user
RUN echo "user ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/90-user
USER user

# All users can use /home/user as their home directory
ENV HOME=/home/user
RUN chmod 777 /home/user

# Install Miniconda
RUN curl -so ~/miniconda.sh https://repo.continuum.io/miniconda/Miniconda3-py37_4.8.2-Linux-x86_64.sh \
    && chmod +x ~/miniconda.sh \
    && ~/miniconda.sh -b -p ~/miniconda \
    && rm ~/miniconda.sh
ENV PATH=/home/user/miniconda/bin:$PATH
ENV CONDA_AUTO_UPDATE_CONDA=false

# Update Conda first
RUN conda update conda
And my TensorFlow image, built on top of that base, is:
FROM mlflowcore/base:1.0.0

# Install the conda environment
COPY tensorflow_environment.yml .
RUN conda env create -f tensorflow_environment.yml && conda clean -a

# Activate the environment
RUN echo "source activate tensorflow-2.1-cuda-10.1" > ~/.bashrc
ENV PATH /opt/conda/envs/env/bin:$PATH

# Dump the details of the installed packages to a file for posterity
RUN conda env export --name tensorflow-2.1-cuda-10.1 > tensorflow-2.1-cuda-10.1.yml
with the following tensorflow_environment.yml:
name: tensorflow-2.1-cuda-10.1
channels:
  - conda-forge
  - defaults
dependencies:
  - defaults::cudatoolkit=10.1
  #- defaults::tensorflow=2.1.0 → distribute.MirroredStrategy API changed in 2.2 → Custom training with tf.distribute.Strategy | TensorFlow Core
  - conda-forge::graphviz=2.40.1
  - conda-forge::python-graphviz=0.13.2
  - pip
  - pip:
    - tensorflow==2.2.0rc2
However, when I try to run anything on the GPU, I get:
2020-04-02 09:39:54.522822: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-04-02 09:39:54.570821: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-04-02 09:39:54.572960: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-04-02 09:39:54.573491: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-04-02 09:39:54.577211: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-04-02 09:39:54.578817: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-04-02 09:39:54.579250: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/lib64
2020-04-02 09:39:54.579268: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1598] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU.
Skipping registering GPU devices...
Why does it not find that file? Where is it?
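To narrow this down, a check along the following lines could be run inside the container. It is only a minimal sketch using the Python standard library: `find_library_on_path` and `can_dlopen` are hypothetical helper names, not part of TensorFlow. The first lists any files matching the library name in the directories from LD_LIBRARY_PATH shown in the warning above, and the second tries to load the library by name, which is roughly what the failing step in the log does.

```python
import ctypes
import glob
import os


def find_library_on_path(name, search_path):
    """Return all files in the colon-separated search_path whose name starts with `name`."""
    matches = []
    for directory in search_path.split(":"):
        if not directory:
            continue
        # Prefix match, so libcudnn.so.7 also finds e.g. libcudnn.so.7.6.5
        pattern = os.path.join(glob.escape(directory), glob.escape(name) + "*")
        matches.extend(sorted(glob.glob(pattern)))
    return matches


def can_dlopen(name):
    """Return True if the dynamic loader can open the library by name."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    lib = "libcudnn.so.7"
    path = os.environ.get("LD_LIBRARY_PATH", "")
    print("LD_LIBRARY_PATH:", path)
    print("files found:", find_library_on_path(lib, path))
    print("dlopen works:", can_dlopen(lib))
```

If the list comes back empty and the load fails, the library is most likely not installed in any of those directories. The same helper can be pointed at other locations, for example the `lib` directory of the conda environment (its exact path is an assumption here and depends on where Miniconda was installed).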
Help would be highly appreciated.
Thank you very much!
Best