Hello,
I am trying to bootstrap ONNXRuntime with TensorRT Execution Provider and PyTorch inside a docker container to serve some models.
After a ton of digging, it looks like I need to build the onnxruntime wheel myself to enable TensorRT support, so I do something like the following in my Dockerfile:
```dockerfile
# ---- Stage 1: build the onnxruntime wheel with the TensorRT EP ----
FROM nvcr.io/nvidia/tensorrt:21.03-py3 as onnxruntime

ARG ONNXRUNTIME_REPO=https://github.com/Microsoft/onnxruntime
ARG ONNXRUNTIME_BRANCH=v1.7.2

RUN apt-get update && \
    apt-get install -y sudo git bash unattended-upgrades
RUN unattended-upgrade
RUN python -m pip install --upgrade pip setuptools wheel

WORKDIR /code
ENV PATH /usr/local/nvidia/bin:/usr/local/cuda/bin:/code/cmake-3.14.3-Linux-x86_64/bin:/opt/miniconda/bin:${PATH}

# Prepare onnxruntime repository & build onnxruntime with TensorRT
RUN git clone --single-branch --branch ${ONNXRUNTIME_BRANCH} --recursive ${ONNXRUNTIME_REPO} onnxruntime && \
    /bin/sh onnxruntime/dockerfiles/scripts/install_common_deps.sh && \
    cd onnxruntime && \
    /bin/sh ./build.sh --parallel --cuda_home /usr/local/cuda --cudnn_home /usr/lib/x86_64-linux-gnu/ \
        --use_tensorrt --tensorrt_home /workspace/tensorrt --config Release --build_wheel \
        --update --build --cmake_extra_defines ONNXRUNTIME_VERSION=$(cat ./VERSION_NUMBER)

# ---- Stage 2: runtime image based on NVIDIA's PyTorch container ----
FROM nvcr.io/nvidia/pytorch:21.03-py3

RUN --mount=type=cache,id=apt-dev,target=/var/cache/apt \
    apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
        ca-certificates \
        curl && \
    rm -rf /var/lib/apt/lists/* && \
    useradd --create-home mrc
USER mrc

ENV PATH="$PATH:/home/mrc/.local/bin" \
    TORCH_HOME=/home/mrc/torch_models

COPY ./requirements/requirements.txt /tmp/requirements.txt
COPY --from=onnxruntime /code/onnxruntime/build/Linux/Release/dist/*.whl /tmp

# Upgrade pip, setuptools and wheel
# Install the wheel and the requirements
# Download the punkt library from nltk.
RUN python -m pip install --upgrade pip setuptools wheel && \
    python -m pip install /tmp/*.whl && \
    python -m pip install -r /tmp/requirements.txt

# I then expose some ports and start my application with gunicorn
```
In the above, I essentially do a two-stage build: the first stage generates the Python wheel with the TensorRT execution provider, and the second stage, based on NVIDIA's PyTorch container, copies the wheel over and installs it.
However, I run into the following issue:

```
#16 4.192 ERROR: onnxruntime_gpu_tensorrt-1.7.2-cp37-cp37m-linux_x86_64.whl is not a supported wheel on this platform.
```
Both stages start from the same NVIDIA versioned base containers and contain the same Python, nvcc, OS, etc. Note that I am using NVIDIA's 21.03 containers, but the same issue persists with the 20.12 containers as well (which is the version used by the Dockerfile.tensorrt example in the onnxruntime repository).
I also notice that the compiled wheel is tagged cp37, while the environment where I compiled it has Python 3.8. Could this mismatch be causing the issue? Any pointers on what I could be doing wrong here?
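As a sanity check, something like the following stdlib-only sketch compares the CPython tag embedded in the wheel's filename with the tag of the running interpreter (the `cp37` value below is just copied from the failing wheel's filename):

```python
# Sketch: pip refuses to install a wheel whose interpreter tag does not match
# the Python it runs under (hence "not a supported wheel on this platform").
# This derives the current interpreter's cpXY tag and compares it with the
# tag from the wheel's filename.
import sys

wheel_tag = "cp37"  # from onnxruntime_gpu_tensorrt-1.7.2-cp37-cp37m-linux_x86_64.whl
interpreter_tag = f"cp{sys.version_info.major}{sys.version_info.minor}"

print(interpreter_tag)  # e.g. "cp38" inside a Python 3.8 container
if wheel_tag != interpreter_tag:
    print(f"mismatch: wheel is {wheel_tag}, interpreter is {interpreter_tag}")
```

Running this inside the second-stage container would show whether pip's rejection is purely down to the tag mismatch.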
Thank you!
(I would also appreciate any advice if there is an easier way to accomplish the above; the multi-stage Docker build feels like a bit of overkill for my task…)