Hi, I'm running into an issue where switching my base Docker image results in slower speeds (specifically for inference with deep learning models).
Previously I was using a base image of nvidia/cuda:12.2.0-runtime-ubuntu20.04
with the following Dockerfile:
FROM nvidia/cuda:12.2.0-runtime-ubuntu20.04
# Keeps Python from generating .pyc files in the container
ENV PYTHONDONTWRITEBYTECODE=1
EXPOSE 8080
# The directory is created by root. This sets permissions so that any user can
# access the folder.
RUN mkdir -m 777 -p /usr/app /home
WORKDIR /usr/app
ENV HOME=/home
# Install python 3.9
# (installing 3.10 like this added 2GB to the image size)
RUN apt-get update && \
DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
python3.9 python3.9-distutils python3.9-dev curl build-essential
RUN curl -sSL https://bootstrap.pypa.io/get-pip.py -o get-pip.py
RUN python3.9 get-pip.py
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt
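In this image, the CUDA-enabled torch wheel comes from the pin in requirements.txt (I haven't included the exact versions here). To compare what each container actually ships, I can run a small version dump inside both of them; this is just a minimal sketch, assuming torch imports cleanly:
# compare_versions.py -- minimal sketch, run inside each container
import torch
print("torch:", torch.__version__)
print("CUDA runtime:", torch.version.cuda)
print("cuDNN:", torch.backends.cudnn.version())
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))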
However, I recently switched to using a base image of nvcr.io/nvidia/pytorch:23.08-py3
with a Dockerfile that looks like:
FROM nvcr.io/nvidia/pytorch:23.08-py3
# Keeps Python from generating .pyc files in the container
ENV PYTHONDONTWRITEBYTECODE=1
EXPOSE 8080
# The directory is created by root. This sets permissions so that any user can
# access the folder.
RUN mkdir -m 777 -p /usr/app /home
WORKDIR /usr/app
ENV HOME=/home
COPY requirements.txt requirements.txt
# Remove the pinned torch package from the requirements so it doesn't conflict
# with the version already in the base image.
RUN sed -i '/torch==/d' requirements.txt
RUN pip install -r requirements.txt --no-cache-dir
ENV PYTHONPATH="${PYTHONPATH}:/usr/app/"
ENV LD_LIBRARY_PATH=/usr/local/cuda/compat/lib.real:/usr/local/lib/python3.10/dist-packages/torch/lib:/usr/local/lib/python3.10/dist-packages/torch_tensorrt/lib:/usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
ENV PATH=/usr/local/nvm/versions/node/v16.20.0/bin:/usr/local/lib/python3.10/dist-packages/torch_tensorrt/bin:/usr/local/mpi/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/ucx/bin:/opt/tensorrt/bin
I added the environment variables LD_LIBRARY_PATH and PATH based on this StackOverflow post, because I was getting the same warning described there.
Setting those variables lets me use CUDA, and inference is indeed faster than running without CUDA, but it is still ~3x slower than it was with the previous image.
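To separate the container setup from my model code, I can run a rough GPU micro-benchmark in both containers to see whether raw throughput itself differs; this is just a hypothetical matmul timing sketch, not my actual inference workload:
# gpu_bench.py -- rough, hypothetical matmul timing (not my real inference code)
import time
import torch

device = torch.device("cuda")
x = torch.randn(4096, 4096, device=device)

# warm up so one-time CUDA initialization isn't counted
for _ in range(10):
    y = x @ x
torch.cuda.synchronize()

start = time.perf_counter()
for _ in range(100):
    y = x @ x
torch.cuda.synchronize()
print(f"100 matmuls: {time.perf_counter() - start:.3f}s")
If that number comes out roughly the same in both images, the slowdown presumably lies in the Python/library stack rather than in the GPU setup itself.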
This is all running on an NVIDIA_TESLA_T4 GPU.
Does anyone have an idea what could be causing these slowdowns? If you need any additional information, I'm happy to provide it.