TensorFlow Serving 1.15 on Ampere A100

I need to run TensorFlow Serving 1.15 on Ampere-architecture GPUs. I followed the tutorial given here, but it does not cover TensorFlow Serving at all. I tried building a Docker image as follows:

FROM nvcr.io/nvidia/tensorflow:20.06-tf1-py3 as base_build


RUN apt-get update && apt-get install -y --no-install-recommends \
        automake \
        build-essential \
        ca-certificates \
        curl \
        git \
        libfreetype6-dev \
        libtool \
        libcurl3-dev \
        libzmq3-dev \
        mlocate \
        openjdk-8-jdk \
        openjdk-8-jre-headless \
        pkg-config \
        python-dev \
        software-properties-common \
        swig \
        unzip \
        wget \
        zip \
        zlib1g-dev \
        python3-distutils \
        python-distutils-extra \
        && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Install python 3.6.
RUN add-apt-repository ppa:deadsnakes/ppa && \
    apt-get update && apt-get install -y \
    python3.6 python3.6-dev python3-pip python3.6-venv && \
    rm -rf /var/lib/apt/lists/* && \
    python3.6 -m pip install pip --upgrade && \
    update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.6 0

# Make python3.6 the default python version
RUN update-alternatives --install /usr/bin/python python /usr/bin/python3.6 0

RUN curl -fSsL -O https://bootstrap.pypa.io/get-pip.py && \
    python3 get-pip.py && \
    rm get-pip.py

# Quote the version specifiers so the shell does not treat '>' as a redirect.
RUN pip3 --no-cache-dir install \
    'future>=0.17.1' \
    grpcio \
    h5py \
    mock \
    numpy \
    portpicker \
    requests \
    --ignore-installed 'six>=1.12.0'

# Set up Bazel
ENV BAZEL_VERSION 0.24.1
WORKDIR /
RUN mkdir /bazel && \
    cd /bazel && \
    curl -H "User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36" -fSsL -O https://github.com/bazelbuild/bazel/releases/download/$BAZEL_VERSION/bazel-$BAZEL_VERSION-installer-linux-x86_64.sh && \
    curl -H "User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36" -fSsL -o /bazel/LICENSE.txt https://raw.githubusercontent.com/bazelbuild/bazel/master/LICENSE && \
    chmod +x bazel-*.sh && \
    ./bazel-$BAZEL_VERSION-installer-linux-x86_64.sh && \
    cd / && \
    rm -f /bazel/bazel-$BAZEL_VERSION-installer-linux-x86_64.sh

# Build TensorFlow with the CUDA configuration
ENV CI_BUILD_PYTHON python
ENV LD_LIBRARY_PATH /usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda/lib64:/usr/local/cuda/lib64/stubs:/usr/include/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH

ENV TF_NEED_CUDA 1
ENV TF_NEED_TENSORRT 0
ENV TENSORRT_INSTALL_PATH=/usr/lib/x86_64-linux-gnu
ENV TF_CUDA_VERSION=
ENV TF_CUDNN_VERSION=

# Fix paths so that CUDNN can be found: https://github.com/tensorflow/tensorflow/issues/8264
WORKDIR /
RUN mkdir /usr/lib/x86_64-linux-gnu/include/
RUN ln -s /usr/include/cudnn.h /usr/local/cuda/include/cudnn.h && \
  ln -s /usr/lib/x86_64-linux-gnu/libcudnn.so /usr/local/cuda/lib64/libcudnn.so && \
  ln -s /usr/lib/x86_64-linux-gnu/libcudnn.so.${TF_CUDNN_VERSION} /usr/local/cuda/lib64/libcudnn.so.${TF_CUDNN_VERSION}

# For backward compatibility we need this line. After 1.13 we can safely remove
# it.
ENV TF_NCCL_VERSION=

# Set TMP for nvidia build environment
ENV TMP="/tmp"

# Download TF Serving sources (optionally at specific commit).
RUN curl -sSL --retry 5 https://github.com/tensorflow/serving/archive/refs/tags/1.15.0.tar.gz | tar --strip-components=1 -xzf -

RUN curl -sSL --retry 5 https://github.com/tensorflow/tensorflow/archive/590d6eef7e91a6a7392c8ffffb7b58f2e0c8bc6b.tar.gz | tar -xzf -

FROM base_build as binary_build
# Build, and install TensorFlow Serving
ARG TF_SERVING_BUILD_OPTIONS="--config=release"
RUN echo "Building with build options: ${TF_SERVING_BUILD_OPTIONS}"
ARG TF_SERVING_BAZEL_OPTIONS=""
RUN echo "Building with Bazel options: ${TF_SERVING_BAZEL_OPTIONS}"

RUN ln -s /usr/local/cuda/lib64/libcusolver.so.*.*.* /usr/local/cuda/lib64/libcusolver.so.${TF_CUDA_VERSION}
RUN ln -s /usr/local/cuda/lib64/stubs/libcuda.so /usr/local/cuda/lib64/stubs/libcuda.so.1 && \
    LD_LIBRARY_PATH=/usr/local/cuda/lib64/stubs:${LD_LIBRARY_PATH} \
    bazel build --color=yes --curses=yes --config=cuda --copt="-fPIC" \
    ${TF_SERVING_BAZEL_OPTIONS} \
    --verbose_failures \
    --output_filter=DONT_MATCH_ANYTHING \
    ${TF_SERVING_BUILD_OPTIONS} \
    tensorflow_serving/model_servers:tensorflow_model_server && \
    cp bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server \
    /usr/local/bin/ && \
    rm /usr/local/cuda/lib64/stubs/libcuda.so.1

# Build and install TensorFlow Serving API
RUN bazel build --color=yes --curses=yes \
    ${TF_SERVING_BAZEL_OPTIONS} \
    --verbose_failures \
    --output_filter=DONT_MATCH_ANYTHING \
    ${TF_SERVING_BUILD_OPTIONS} \
    tensorflow_serving/tools/pip_package:build_pip_package && \
    bazel-bin/tensorflow_serving/tools/pip_package/build_pip_package \
    /tmp/pip && \
    pip --no-cache-dir install --upgrade \
    /tmp/pip/tensorflow_serving_api_gpu-*.whl && \
    rm -rf /tmp/pip

FROM binary_build as clean_build
# Clean up Bazel cache when done.
RUN bazel clean --expunge --color=yes && \
    rm -rf /root/.cache

CMD ["/bin/bash"]

But I’m getting the following error:

ERROR: Skipping 'tensorflow_serving/model_servers:tensorflow_model_server': error loading package 'tensorflow_serving/model_servers': in /tensorflow_serving/tensorflow_version.bzl: in /root/.cache/bazel/_bazel_root/6666cd76f96956469e7be39d750cc7d9/external/org_tensorflow/tensorflow/tensorflow.bzl: Encountered error while reading extension file 'cuda/build_defs.bzl': no such package '@local_config_cuda//cuda': Traceback (most recent call last):
	File "/root/.cache/bazel/_bazel_root/6666cd76f96956469e7be39d750cc7d9/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 1266
		_create_local_cuda_repository(repository_ctx)
	File "/root/.cache/bazel/_bazel_root/6666cd76f96956469e7be39d750cc7d9/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 1033, in _create_local_cuda_repository
		_find_libs(repository_ctx, cuda_config)
	File "/root/.cache/bazel/_bazel_root/6666cd76f96956469e7be39d750cc7d9/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 637, in _find_libs
		_find_cuda_lib("cusolver", repository_ctx, cpu_valu..., <2 more arguments>)
	File "/root/.cache/bazel/_bazel_root/6666cd76f96956469e7be39d750cc7d9/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 589, in _find_cuda_lib
		find_lib(repository_ctx, [("%s/%s" % (based...))], ...)))
	File "/root/.cache/bazel/_bazel_root/6666cd76f96956469e7be39d750cc7d9/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 563, in find_lib
		auto_configure_fail(("None of the libraries match th...)))
	File "/root/.cache/bazel/_bazel_root/6666cd76f96956469e7be39d750cc7d9/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 325, in auto_configure_fail
		fail(("\n%sCuda Configuration Error:%...)))

Cuda Configuration Error: None of the libraries match their SONAME: /usr/local/cuda/lib64/libcusolver.so.11
WARNING: Target pattern parsing failed.
ERROR: error loading package 'tensorflow_serving/model_servers': in /tensorflow_serving/tensorflow_version.bzl: in /root/.cache/bazel/_bazel_root/6666cd76f96956469e7be39d750cc7d9/external/org_tensorflow/tensorflow/tensorflow.bzl: Encountered error while reading extension file 'cuda/build_defs.bzl': no such package '@local_config_cuda//cuda': Traceback (most recent call last):
	File "/root/.cache/bazel/_bazel_root/6666cd76f96956469e7be39d750cc7d9/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 1266
		_create_local_cuda_repository(repository_ctx)
	File "/root/.cache/bazel/_bazel_root/6666cd76f96956469e7be39d750cc7d9/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 1033, in _create_local_cuda_repository
		_find_libs(repository_ctx, cuda_config)
	File "/root/.cache/bazel/_bazel_root/6666cd76f96956469e7be39d750cc7d9/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 637, in _find_libs
		_find_cuda_lib("cusolver", repository_ctx, cpu_valu..., <2 more arguments>)
	File "/root/.cache/bazel/_bazel_root/6666cd76f96956469e7be39d750cc7d9/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 589, in _find_cuda_lib
		find_lib(repository_ctx, [("%s/%s" % (based...))], ...)))
	File "/root/.cache/bazel/_bazel_root/6666cd76f96956469e7be39d750cc7d9/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 563, in find_lib
		auto_configure_fail(("None of the libraries match th...)))
	File "/root/.cache/bazel/_bazel_root/6666cd76f96956469e7be39d750cc7d9/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 325, in auto_configure_fail
		fail(("\n%sCuda Configuration Error:%...)))

Cuda Configuration Error: None of the libraries match their SONAME: /usr/local/cuda/lib64/libcusolver.so.11
INFO: Elapsed time: 0.866s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
    currently loading: tensorflow_serving/model_servers
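
The configuration error above reports that the file found at /usr/local/cuda/lib64/libcusolver.so.11 does not carry a matching SONAME. One way to see which SONAME the installed cuSOLVER actually embeds is to read it with readelf. This is only a diagnostic sketch, not a fix: it assumes readelf (from binutils) is available in the image, and the names extract_soname and CUDA_LIB_DIR are hypothetical, introduced here for illustration.

```shell
#!/bin/sh
# Pull the SONAME out of `readelf -d` output supplied on stdin.
extract_soname() {
    sed -n 's/.*Library soname: \[\(.*\)\].*/\1/p'
}

# Assumed default, taken from the path shown in the error message.
CUDA_LIB_DIR="${CUDA_LIB_DIR:-/usr/local/cuda/lib64}"

# Print every cuSOLVER file (real files and symlinks) with its embedded SONAME.
for lib in "$CUDA_LIB_DIR"/libcusolver.so*; do
    [ -e "$lib" ] || continue   # the glob matched nothing
    printf '%s -> %s\n' "$lib" "$(readelf -d "$lib" | extract_soname)"
done
```

If the SONAME printed has a different major version than the file name in the error, a symlink with the expected name is likely to keep failing the check, since the message indicates that the file name and the SONAME are being compared. Note also that the Dockerfile builds the symlink name from TF_CUDA_VERSION, which is set to an empty value there.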

Hi,

Sorry for the delayed response.
Are you still facing this issue?

Thank you.

Yes, I’m still not able to compile TF Serving. Do you have a Dockerfile for TF Serving 1.15?

Hi,

Sorry, we do not have an example Dockerfile. We recommend referring to the release notes and making sure your system satisfies the prerequisites. Also, please use the latest container version to avoid known issues.

Thank you.
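
As a rough aid for the prerequisite check suggested above, the GPU, driver, and CUDA toolkit versions can be printed from inside the running container and compared against the release notes. This is a minimal sketch under assumptions: the helper name have is hypothetical, nvidia-smi is only expected to work when the container is started with GPU access, and nvcc is assumed to be on PATH in a development image.

```shell
#!/bin/sh
# Return success if the named command is on PATH.
have() {
    command -v "$1" >/dev/null 2>&1
}

# GPU model and driver version as reported by the driver.
if have nvidia-smi; then
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
else
    echo "nvidia-smi not found (was the container started with GPU access?)"
fi

# CUDA toolkit version that the build would compile against.
if have nvcc; then
    nvcc --version | tail -n 2
else
    echo "nvcc not found on PATH"
fi
```

The script only reports what it finds and does not change the image.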

It didn’t help. Was anyone able to compile tensorflow_serving 1.15 on an Ampere A100?