Unable to run inference on TensorFlow Hub models that do not support batching with the Triton Inference Server (nvinferserver)

• Hardware Platform (Jetson / GPU) GPU
• DeepStream Version 6.0
• TensorRT Version 8.0.1
• NVIDIA GPU Driver Version (valid for GPU only) 440.33.01

**• Steps to reproduce the issue**

  • Create a container from the image nvcr.io/nvidia/deepstream:6.0-triton using the following command: sudo docker run -it --gpus all -p 8554:8554 -w /opt/nvidia/deepstream/deepstream-6.0 nvcr.io/nvidia/deepstream:6.0-triton
  • Unzip the attached archive into the current working directory (deepstream-6.0).
  • Go to the efficientdet folder.
  • Run the following command: python3 main.py file:///opt/nvidia/deepstream/deepstream-6.0/efficientdet/data/sample_720p.h264 (a simplified sketch of the pipeline this script builds is shown after these steps).
  • The screenshot of the error is attached.
  • Note: we are facing this problem with TensorFlow models that do not support batching; with models that support dynamic batching, the problem does not occur.
    efficientdet.zip (45.2 MB)
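
In outline, main.py builds a pipeline of roughly the following form (a simplified sketch for reference only, not the attached script itself; the element names are standard DeepStream, while the nvinferserver config file name used here is just a placeholder):

```python
#!/usr/bin/env python3
# Simplified sketch of the pipeline main.py builds:
# uridecodebin -> nvstreammux -> nvinferserver (Triton) -> fakesink
# The config-file-path value is a placeholder, not the actual attached config.
import sys
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst, GLib

Gst.init(None)

desc = (
    "uridecodebin uri={uri} ! m.sink_0 "
    "nvstreammux name=m batch-size=1 width=1280 height=720 ! "
    "nvinferserver config-file-path=config_infer_triton_efficientdet.txt ! "
    "fakesink"
).format(uri=sys.argv[1])

pipeline = Gst.parse_launch(desc)
loop = GLib.MainLoop()
bus = pipeline.get_bus()
bus.add_signal_watch()
bus.connect("message::error", lambda b, m: loop.quit())  # the reported error surfaces here
bus.connect("message::eos", lambda b, m: loop.quit())

pipeline.set_state(Gst.State.PLAYING)
loop.run()
pipeline.set_state(Gst.State.NULL)
```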

Any update regarding this issue?

Sorry for the late response. We will investigate and provide an update soon. Thanks.

OK, awaiting your response!

Please refer to NVIDIA-AI-IOT/deepstream_tao_apps (github.com): sample apps that demonstrate how to deploy models trained with TAO on DeepStream.

Thanks for the update.
However, we want to deploy arbitrary TensorFlow 2 Hub models (not only EfficientDet) using the Triton Inference Server plugin (nvinferserver). We have observed that models that do not support batching cannot be deployed through nvinferserver.
We would appreciate any help in this regard. Thanks!
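
By "models that do not support batching" we mean SavedModels whose serving signature has a fixed leading dimension rather than a dynamic one, which corresponds to max_batch_size: 0 in the Triton model configuration. A quick way to check a given model (the path below is just a placeholder):

```python
# Quick check of whether a TF2 SavedModel / TF Hub export has a dynamic batch
# dimension in its serving signature. The path below is a placeholder; point it
# at any downloaded TF Hub model or local SavedModel directory.
import tensorflow as tf

model = tf.saved_model.load("./efficientdet_saved_model")  # placeholder path
sig = model.signatures["serving_default"]

for name, spec in sig.structured_input_signature[1].items():
    leading = spec.shape[0] if spec.shape.rank else None
    status = "dynamic batch (batching supported)" if leading is None else f"fixed leading dim {leading}"
    print(f"{name}: shape={spec.shape} -> {status}")
```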

Any updates?