How to get tensorflow_model_server on Jetson Xavier

Hello,

I am currently trying to get TensorFlow Serving running on a Jetson AGX Xavier. Is there any alternative to building it from source? The official build-from-source procedure does not work because it requires Docker and the Jetson architecture is not supported, so I would have to build Bazel, etc., which is long and experimental.

If building from source is the only option, what is the best way to make sure it uses the same version as the TensorFlow Python wheel that can currently be installed?

Are there any similar alternatives to TF Serving that are officially supported (i.e., a relatively light Python API to run TensorFlow inference on the Xavier)?

Thanks in advance

Hi,

We have an official TensorFlow package for Xavier:
https://devtalk.nvidia.com/default/topic/1042125/jetson-agx-xavier/official-tensorflow-for-jetson-agx-xavier/

But I’m not sure if any special option needs to be enabled for the TensorFlow server.
It’s recommended to give the package a try.
If anything is missing, you can also build it from source with this example:
https://github.com/peterlee0127/tensorflow-nvJetson

Thanks.

Hi,

Thanks for the answer, but that package is the Python TF package, not the TF model server (this one: https://www.tensorflow.org/tfx/guide/serving). We have successfully used the TensorFlow Python package, but the server is a separate binary. The official from-source build goes through Docker (https://www.tensorflow.org/tfx/serving/setup), but Docker images are not available for the Jetson Xavier as far as I know. I am currently trying to compile it without Docker, but I am encountering issues with TensorFlow 1.13.

Regards

Emilie

Hi,

Sorry that we don’t have much experience with this serving system.

It’s recommended to check first whether TensorFlow Serving supports ARM-based platforms.
Some users have tried to build it on Jetson but failed with an AWS-related issue.

Thanks.

Here is the previous topic:
https://devtalk.nvidia.com/default/topic/1031642/jetson-tx2/tensorflow-serving-on-jetson-tx2/

Maybe you can find some useful information there.
Thanks.

@emilie.wirbel: currently building from source is the only option for TensorFlow serving on aarch64 (ARM), yes.

You might want to have a look at https://github.com/helmuthva/jetson to get started with TensorFlow Serving running on a Xavier inside a Docker container. In this particular case the Xavier is integrated into a multi-arch Kubernetes cluster.

https://github.com/helmuthva/jetson/blob/master/workflow/deploy/tensorflow-serving-base/src/Dockerfile is the base Docker image definition for TensorFlow Serving (with Bazel built in-place).

https://github.com/helmuthva/jetson/blob/master/workflow/deploy/tensorflow-serving/src/Dockerfile is a child image definition to access TensorFlow Serving via a simple webservice.
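Once a serving container is up, models are typically queried over TensorFlow Serving's standard REST API (port 8501 by default, `POST /v1/models/<name>:predict`). A minimal client sketch, assuming a model served under the hypothetical name `my_model` on localhost:

```python
import json

# TensorFlow Serving's standard REST endpoint layout (default port 8501):
#   POST http://<host>:8501/v1/models/<model_name>:predict
MODEL_NAME = "my_model"  # placeholder: use the name your model is served under
SERVER_URL = f"http://localhost:8501/v1/models/{MODEL_NAME}:predict"


def build_predict_request(instances):
    """Build the JSON body expected by the TF Serving REST predict API."""
    return json.dumps({"instances": instances})


# Example payload: one input row of three floats (shape must match the model).
body = build_predict_request([[1.0, 2.0, 3.0]])

# To actually send the request (requires a running server):
#   import requests
#   resp = requests.post(SERVER_URL, data=body)
#   print(resp.json()["predictions"])
```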

Building the images takes about 2 hours.

Basic knowledge of Ansible and Google Skaffold is assumed to understand the build infrastructure used here.

@emilie.wirbel: in case you don’t want to build the images yourself they are now published on DockerHub - see https://hub.docker.com/r/helmuthva/jetson-xavier-tensorflow-serving and https://hub.docker.com/u/helmuthva.

To allow access to the GPU from inside the Docker container you need to mount the following devices:

  • /dev/nvhost-ctrl
  • /dev/nvhost-ctrl-gpu
  • /dev/nvhost-prof-gpu
  • /dev/nvmap
  • /dev/nvhost-gpu
  • /dev/nvhost-as-gpu

With docker run this can be easily achieved like so: docker run --device=/dev/nvhost-ctrl --device=/dev/…
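Spelled out with the device list above, the full command might look like the sketch below. The image name is the DockerHub one mentioned earlier; the port and model-volume choices are assumptions (standard TF Serving gRPC/REST ports) that depend on how the image is actually built:

```shell
# Sketch only: ports, model path, and model name are assumptions.
docker run \
  --device=/dev/nvhost-ctrl \
  --device=/dev/nvhost-ctrl-gpu \
  --device=/dev/nvhost-prof-gpu \
  --device=/dev/nvmap \
  --device=/dev/nvhost-gpu \
  --device=/dev/nvhost-as-gpu \
  -p 8500:8500 -p 8501:8501 \
  -v /path/to/models/my_model:/models/my_model \
  helmuthva/jetson-xavier-tensorflow-serving
```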

See https://github.com/helmuthva/jetson/blob/master/workflow/deploy/tensorflow-serving/kustomize/base/deployment.yaml on how this is done in a Kubernetes deployment.

@HelmutHofferVonAnker Thanks a lot for the reply ! We found your docker image and tried it, we also built from source (master branch of tensorflow-serving), getting inspiration from your dockerfile and it works now.

As a side note, we now have trouble when trying to serve models that were optimized with TensorRT, but that is another issue.

@emilie.wirbel: great.

Regarding TensorRT: I will follow up with an end-to-end example showing how to build a pipeline (training -> optimize for TensorRT -> deploy to shared storage (e.g. using minio) -> serve for inference) soonish, given spare time.

Best

@HelmutHofferVonAnker
Just checking: is it possible to load multiple models using the --model_config_file= flag?

That works in the official TF Serving Docker image.

I tried it with your Docker image and it doesn’t work.
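For reference, --model_config_file= expects TF Serving's model_config_list in protobuf text format. A minimal two-model sketch (model names and paths are placeholders, and each base_path must be visible inside the container, e.g. via a volume mount):

```
model_config_list {
  config {
    name: "model_a"
    base_path: "/models/model_a"
    model_platform: "tensorflow"
  }
  config {
    name: "model_b"
    base_path: "/models/model_b"
    model_platform: "tensorflow"
  }
}
```

The file itself also has to be mounted into the container and its path passed to the tensorflow_model_server binary, e.g. --model_config_file=/models/models.config.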