Seeking Guidance on Building DeepStream Image with Triton Inference Server and TensorFlow GPU Support

• Hardware Platform (Jetson / GPU): GPU
• DeepStream Version: 6.3
• JetPack Version (valid for Jetson only)
• TensorRT Version
• NVIDIA GPU Driver Version: 535.86.10
• Issue Type: Questions

I am currently working on building an image that combines DeepStream with Triton Inference Server, utilizing TensorFlow with GPU support. I am encountering some challenges and would appreciate your expertise in resolving them.

Here is the base image I'm using: nvcr.io/nvidia/tritonserver:24.02-tf2-python-py3

In my setup, I am using the Python backend in Triton Inference Server. When specific preprocessing tasks require TensorFlow, I need to install TensorFlow into the existing Python 3 environment inside the container. Unfortunately, I use TensorFlow in various other components, which makes the conda-pack solution unsuitable for my needs.

Installing TensorFlow with pip install tensorflow-gpu does not enable GPU utilization, and attempting to use TF-TRT also poses challenges.
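
For context, this is the quick check I run to see whether the installed TensorFlow can use the GPU at all (a minimal snippet, assuming TensorFlow imports successfully inside the container):

# Minimal check: does the installed TensorFlow see any GPU devices?
import tensorflow as tf

print("TF version:", tf.__version__)
print("Visible GPUs:", tf.config.list_physical_devices("GPU"))  # empty list means the GPU is not usable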

I kindly request assistance in addressing the following issues:

  1. How can I successfully install TensorFlow with GPU support within the existing Python3 environment in the specified Triton Inference Server container?
  2. Are there any specific configurations or steps required to ensure proper GPU utilization during the installation of TensorFlow?
  3. Are there alternative approaches or best practices for integrating TensorFlow GPU support within Triton Inference Server?

Your insights and guidance on these matters would be immensely valuable to my project. Thank you in advance for your time and assistance.

Please find deepstream:6.4-triton-multiarch at this link; this image includes both DeepStream and Triton.

Thanks, I have tried it, but it doesn't have TensorFlow:

>>> import tensorflow as tf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'tensorflow'
>>> 


This is the error that I am getting.

  1. How can I successfully install TensorFlow with GPU support within the existing Python3 environment in the specified Triton Inference Server container?
  2. Are there any specific configurations or steps required to ensure proper GPU utilization during the installation of TensorFlow?
  3. Are there alternative approaches or best practices for integrating TensorFlow GPU support within Triton Inference Server?

These questions are outside the scope of DeepStream. Triton is open source; please refer to this link, tensorflow_backend, for the TensorFlow issue.

Hi @fanzh, please note that I don't need the TensorFlow backend; I need the Python backend for Triton with TensorFlow support. This is for running operations like GPU-based TensorFlow convolutions in the Python backend.
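
To make the use case concrete, here is a rough sketch of the kind of model.py I have in mind for the Python backend (the tensor names INPUT0/OUTPUT0 and the convolution itself are placeholders, not my actual preprocessing):

# model.py -- sketch of a Triton Python backend model running a TF convolution on the GPU
import numpy as np
import tensorflow as tf
import triton_python_backend_utils as pb_utils

class TritonPythonModel:
    def initialize(self, args):
        # Placeholder 3x3 filter bank; real preprocessing would build its own kernels
        self.filters = tf.random.normal([3, 3, 3, 8])

    def execute(self, requests):
        responses = []
        for request in requests:
            # Assumes an FP32 NHWC image batch declared as INPUT0 in config.pbtxt
            images = pb_utils.get_input_tensor_by_name(request, "INPUT0").as_numpy()
            with tf.device("/GPU:0"):
                out = tf.nn.conv2d(tf.constant(images), self.filters,
                                   strides=1, padding="SAME")
            out_tensor = pb_utils.Tensor("OUTPUT0", out.numpy().astype(np.float32))
            responses.append(pb_utils.InferenceResponse(output_tensors=[out_tensor]))
        return responses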

I am looking for a Dockerfile that has DeepStream as the base and the following layers on top of it, which build TensorFlow from source.

## Download & build TensorFlow from source
ARG LATEST_BAZELISK=1.5.0
ARG LATEST_BAZEL=3.7.2
ARG CTO_TENSORFLOW_VERSION=2.7.0
ARG PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python
ARG CTO_TF_OPT=""
ARG CTO_DNN_ARCH=""
ARG TF_NEED_CLANG=0
# Marker file checked by tf_build.sh to request a CUDA/cuDNN (GPU) build
RUN touch /tmp/.GPU_build
COPY tools/tf_build.sh /tmp/
RUN curl -s -Lo /usr/local/bin/bazel https://github.com/bazelbuild/bazelisk/releases/download/v${LATEST_BAZELISK}/bazelisk-linux-amd64
RUN chmod +x /usr/local/bin/bazel
RUN mkdir -p /usr/local/src/tensorflow
WORKDIR /usr/local/src
RUN wget -q --no-check-certificate -c https://github.com/tensorflow/tensorflow/archive/v${CTO_TENSORFLOW_VERSION}.tar.gz -O - | tar --strip-components=1 -xz -C /usr/local/src/tensorflow
WORKDIR /usr/local/src/tensorflow
# Write .bazelversion: use the lower of LATEST_BAZEL and the maximum Bazel version accepted by TensorFlow's configure.py
RUN fgrep _TF_MAX_BAZEL configure.py | grep '=' | perl -ne '$lb="'${LATEST_BAZEL}'";$brv=$1 if (m%\=\s+.([\d\.]+).$+%); sub numit{@g=split(m%\.%,$_[0]);return(1000000*$g[0]+1000*$g[1]+$g[2]);}; if (&numit($brv) > &numit($lb)) { print "$lb" } else {print "$brv"};' > .bazelversion
RUN bazel clean
RUN chmod +x /tmp/tf_build.sh
# Configure and build TensorFlow via tf_build.sh; the next two steps package and install the resulting pip wheel
RUN time /tmp/tf_build.sh Av1
RUN time ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
RUN time pip3 install /tmp/tensorflow_pkg/tensorflow-*.whl

where tf_build.sh is:

#!/bin/bash
set -e

# Preliminary list of variables found at https://gist.github.com/PatWie/0c915d5be59a518f934392219ca65c3d
# Additional (from 2.2):
# 'build' options: --apple_platform_type=macos --define framework_shared_object=true --define open_source_build=true --java_toolchain=//third_party/toolchains/java:tf_java_toolchain --host_java_toolchain=//third_party/toolchains/java:tf_java_toolchain --define=use_fast_cpp_protos=true --define=allow_oversize_protos=true --spawn_strategy=standalone -c opt --announce_rc --define=grpc_no_ares=true --noincompatible_remove_legacy_whole_archive --noincompatible_prohibit_aapt1 --enable_platform_specific_config --config=v2
#INFO: Reading rc options for 'build' from /usr/local/src/tensorflow/.tf_configure.bazelrc:
#  'build' options: --action_env PYTHON_BIN_PATH=/usr/local/bin/python --action_env PYTHON_LIB_PATH=/usr/local/lib/python3.6/dist-packages --python_path=/usr/local/bin/python --action_env TF_CUDA_VERSION=10.1 --action_env TF_CUDNN_VERSION=7 --action_env CUDA_TOOLKIT_PATH=/usr/local/cuda-10.1 --action_env TF_CUDA_COMPUTE_CAPABILITIES=3.5,7.0 --action_env LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/extras/CUPTI/lib64 --action_env GCC_HOST_COMPILER_PATH=/usr/bin/x86_64-linux-gnu-gcc-7 --config=cuda --action_env TF_CONFIGURE_IOS=0

cd /usr/local/src/tensorflow
# Args: Tensorflow config build extra (ex:v1)

# Default config is to have it "CPU"
config_add="--config=opt"

if [ "A$1" == "A" ]; then 
  echo "Usage: $0 TF_config"
  echo "  TF_config: 1x extra argument to pass to --config= (ex: v1)"
  exit 1
fi

if [ "A$2" != "A" ]; then
  config_add="$config_add --config=$2"
fi

echo "--- Tensorflow Build --- " > /tmp/tf_env.dump
export TF_NEED_CUDA=0
cudnn=0
if [ -f /tmp/.GPU_build ]; then
  cudnn=1
  echo "** CUDNN requested" | tee -a /tmp/tf_env.dump
  # CuDNN8
  cudnn_inc="/usr/include/cudnn.h"
  cudnn8_inc="/usr/include/x86_64-linux-gnu/cudnn_version_v8.h"
  if [ -f $cudnn8_inc ]; then
    cudnn_inc="${cudnn8_inc}"
  fi
  if [ ! -f $cudnn_inc ]; then
    echo "** Unable to find $cudnn_inc, will not be able to compile requested GPU build, aborting" | tee -a /tmp/tf_env.dump
    exit 1
  fi
fi

if [ "A$2" == "Av1" ]; then
  export _TMP=`python -c 'import sys; version=sys.version_info[:3]; print("{0}.{1}".format(*version))'`
  if [ "A${_TMP}" == "A3.8" ]; then
    echo "[**] Patching TF1 & Python 3.8 issue"
    # https://github.com/tensorflow/tensorflow/issues/34197
    # https://github.com/tensorflow/tensorflow/issues/33543
    perl -pi.bak -e 's%nullptr(,\s+/.\s+tp_print)%NULL$1%' tensorflow/python/lib/core/ndarray_tensor_bridge.cc tensorflow/python/lib/core/bfloat16.cc tensorflow/python/eager/pywrap_tfe_src.cc 
    diff -u tensorflow/python/lib/core/ndarray_tensor_bridge.cc{.bak,} || true
    diff -u tensorflow/python/lib/core/bfloat16.cc{.bak,} || true
    diff -u tensorflow/python/eager/pywrap_tfe_src.cc{.bak,} || true
  fi
fi

if [ "A$cudnn" == "A1" ]; then
  export TF_CUDNN_VERSION="$(sed -n 's/^#define CUDNN_MAJOR\s*\(.*\).*/\1/p' $cudnn_inc)"
  if [ "A$TF_CUDNN_VERSION" == "A" ]; then
    cudnn=0
    echo "** Problem finding DNN major version, aborting" | tee -a /tmp/tf_env.dump
    exit 1
  else
    export TF_NEED_CUDA=1
    export TF_NEED_TENSORRT=0
    export TF_CUDA_VERSION="$(nvcc --version | sed -n 's/^.*release \(.*\),.*/\1/p')"
    export TF_CUDA_COMPUTE_CAPABILITIES="${CTO_DNN_ARCH}"

    nccl_inc="/usr/local/cuda/include/nccl.h"
    nccl2_inc='/usr/include/nccl.h'
    if [ -f $nccl2_inc ]; then
      nccl_inc="${nccl2_inc}"
    fi
    if [ -f $nccl_inc ]; then
      export TF_NCCL_VERSION="$(sed -n 's/^#define NCCL_MAJOR\s*\(.*\).*/\1/p' $nccl_inc)"
    fi

    # cudnn build: TF 1.15.[345] with CUDA 10.2 fix -- see https://github.com/tensorflow/tensorflow/issues/34429
    if [ "A${TF_CUDA_VERSION=}" == "A10.2" ]; then
      if grep VERSION /usr/local/src/tensorflow/tensorflow/tensorflow.bzl | grep -q '1.15.[345]' ; then
        echo "[**] Patching third_party/nccl/build_defs.bzl.tpl"
        perl -pi.bak -e 's/("--bin2c-path=%s")/## 1.15.x compilation ## $1/' third_party/nccl/build_defs.bzl.tpl
        diff -u third_party/nccl/build_defs.bzl.tpl{.bak,} || true
      fi
    fi
    
    # cudnn build: TF 2.5.[01] with CUDA 10.2 fix
    if [ "A${TF_CUDA_VERSION=}" == "A10.2" ]; then
      if grep VERSION /usr/local/src/tensorflow/tensorflow/tensorflow.bzl | grep -q '2.5.[01]' ; then
        # https://github.com/tensorflow/tensorflow/pull/48393    
        echo "[**] Patching third_party/cub.BUILD"
        perl -pi.bak -e 's%\@local_cuda//%\@local_config_cuda//cuda%' third_party/cub.BUILD
        diff -u third_party/cub.BUILD{.bak,} || true
        # https://github.com/tensorflow/tensorflow/issues/48468#issuecomment-819251527
        echo "[**] Patching third_party/absl/workspace.bzl"
        perl -pi.bak -e 's%(patch_file = "//third_party/absl:com_google_absl_fix_mac_and_nvcc_build.patch)%#$1%' third_party/absl/workspace.bzl
        diff -u third_party/absl/workspace.bzl{.bak,} || true
        # https://github.com/tensorflow/tensorflow/issues/48468#issuecomment-819378549
        echo "[**] Patching tensorflow/core/platform/default/cord.h"
        perl -pi.bak -e 's%^(\#include "absl/strings/cord.h")%#if \!defined\(__CUDACC__\)\n$1%;s%^(\#define TF_CORD_SUPPORT 1)%$1\n\#endif%' tensorflow/core/platform/default/cord.h
        diff -u tensorflow/core/platform/default/cord.h{.bak,} || true
      fi
    fi

  fi

  config_add="$config_add --config=cuda"

else
# here: NOT CUDNN
  if grep VERSION /usr/local/src/tensorflow/tensorflow/tensorflow.bzl | grep -q '1.15.5' ; then
#    echo "[**] Patching third_party/nccl/build_defs.bzl.tpl"
#    perl -pi.bak -e 's/("--bin2c-path=%s")/## 1.15.x compilation ## $1/' third_party/nccl/build_defs.bzl.tpl
    # https://github.com/tensorflow/tensorflow/issues/33758#issuecomment-547867642
    echo "[**] Patching grpc dependencies"
    curl -s -L https://github.com/tensorflow/tensorflow/compare/master...hi-ogawa:grpc-backport-pr-18950.patch | patch -p1
  fi

fi

export GCC_HOST_COMPILER_PATH=$(which gcc)
#export CC_OPT_FLAGS="-march=native"
export CC_OPT_FLAGS=""
export PYTHON_BIN_PATH=$(which python)
export PYTHON_LIB_PATH="$(python -c 'import site; print(site.getsitepackages()[0])')"

export TF_CUDA_CLANG=0

export TF_DOWNLOAD_CLANG=0
export TF_DOWNLOAD_MKL=0

export TF_ENABLE_XLA=0

export TF_NEED_AWS=0
export TF_NEED_COMPUTECPP=0
export TF_NEED_GCP=0
export TF_NEED_GDR=0
export TF_NEED_HDFS=0
export TF_NEED_JEMALLOC=1
export TF_NEED_KAFKA=0
export TF_NEED_MKL=0
export TF_NEED_MPI=0
export TF_NEED_OPENCL=0
export TF_NEED_OPENCL_SYCL=0
export TF_NEED_ROCM=0
export TF_NEED_S3=0
export TF_NEED_VERBS=0

export TF_SET_ANDROID_WORKSPACE=0

##

echo "-- Environment variables set:"  | tee -a /tmp/tf_env.dump
env | grep TF_ | grep -v CTO_ | sort | tee -a /tmp/tf_env.dump
for i in GCC_HOST_COMPILER_PATH CC_OPT_FLAGS PYTHON_BIN_PATH PYTHON_LIB_PATH; do
  echo "$i="`printenv $i` | tee -a /tmp/tf_env.dump
done

echo "-- ./configure output:" | tee -a /tmp/tf_env.dump
./configure | tee -a /tmp/tf_env.dump 

start_time=$SECONDS
echo "-- bazel command to run:" | tee -a /tmp/tf_env.dump
build_cmd="bazel build --verbose_failures $config_add //tensorflow/tools/pip_package:build_pip_package"
echo  $build_cmd| tee -a /tmp/tf_env.dump 
$build_cmd
end_time=$SECONDS
elapsed=$(( end_time - start_time ))
echo "-- TensorFlow building time (in seconds): $elapsed" | tee -a /tmp/tf_env.dump

The problem here is that after building TensorFlow from source, the GPU is not being used by the library. Please let me know if this is the correct way.
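
For reference, this is the check I use to tell a wheel built without CUDA apart from one built with CUDA that merely fails to load its runtime libraries (a small sketch using the standard TensorFlow introspection calls):

# Was this TensorFlow built with CUDA, and can it see the GPU at runtime?
import tensorflow as tf

info = tf.sysconfig.get_build_info()
print("Built with CUDA:", tf.test.is_built_with_cuda())
print("CUDA version at build time:", info.get("cuda_version"))
print("cuDNN version at build time:", info.get("cudnn_version"))
print("Visible GPUs at runtime:", tf.config.list_physical_devices("GPU"))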


Please refer to these Triton links, python_backend and tensorflow, for using the Python backend for Triton with TensorFlow support.

I was able to install TensorFlow with GPU support in the Triton Inference Server Python backend with the following layers in the Dockerfile:

RUN mkdir /tf_temp

# TensorFlow 2.10 wheel and the matching cuDNN 8.1 runtime package, copied into the image
COPY tensorflow-2.10.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl /tf_temp
COPY libcudnn8_8.1.0.77-1+cuda11.2_amd64.deb /tf_temp
# WORKDIR /tf_temp
RUN apt-get install -y --allow-downgrades /tf_temp/libcudnn8_8.1.0.77-1+cuda11.2_amd64.deb
RUN python3 -m pip install /tf_temp/tensorflow-2.10.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

# Provide the older library sonames this TensorFlow wheel was linked against
# by symlinking to the libraries already present in the container
WORKDIR /usr/local/cuda-12.1/targets/x86_64-linux/lib
RUN ln -s libcufft.so libcufft.so.10
RUN ln -s libcusparse.so libcusparse.so.11
WORKDIR /usr/lib/x86_64-linux-gnu

RUN ln -s libnvinfer.so.8 libnvinfer.so.7
RUN ln -s libnvinfer_plugin.so.8 libnvinfer_plugin.so.7
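
As a quick sanity check that the symlinked sonames actually resolve inside the image, something like this can be run before starting Triton (just a sketch; the list mirrors the links created above plus the cuDNN package installed earlier):

# Verify that the libraries TensorFlow will dlopen can be found
import ctypes

for lib in ["libcudnn.so.8", "libcufft.so.10", "libcusparse.so.11",
            "libnvinfer.so.7", "libnvinfer_plugin.so.7"]:
    try:
        ctypes.CDLL(lib)
        print(lib, "OK")
    except OSError as err:
        print(lib, "FAILED:", err)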
