How To Generate A Stand-Alone TensorRT Plan


I’m working with TensorFlow 2 and TensorRT 8, trying to convert a model from TensorFlow to TensorRT.
Later, I want to take the optimized model and run it on a Jetson Nano with DeepStream and Triton.

I’ve followed this code, which works fine for the conversion: 2.2.4. TF-TRT 2.0 Workflow With A SavedModel. The optimized model is still stored as model.savedmodel. I’m not saving the engines, because the conversion runs on a workstation and I will later use the model on a Nano.
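For reference, a minimal sketch of the TF-TRT 2.0 SavedModel workflow mentioned above, assuming TensorFlow 2.x with TensorRT support (the directory names and FP16 precision are example choices, not taken from the original post). Note that pre-built TensorRT engines are specific to the GPU they were built on, so engines built on a workstation would not be usable on a Jetson Nano anyway:

```python
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Conversion parameters; FP16 is an example precision choice.
params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
    precision_mode=trt.TrtPrecisionMode.FP16)

converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="model.savedmodel",  # hypothetical input path
    conversion_params=params)
converter.convert()

# Optionally pre-build engines by feeding representative inputs:
# converter.build(input_fn=my_input_fn)
# (skipped here, since engines built on a workstation are not
# portable to a different GPU such as the Nano)

converter.save("trt_savedmodel")  # hypothetical output path
```

The saved result is still a TensorFlow SavedModel containing TRTEngineOp nodes, not a stand-alone TensorRT plan file.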

Here is my problem: I want to serialize the TRT graph, but I can’t find the correct way to do it.
This code snippet doesn’t fit my case: How To Generate A Stand-Alone TensorRT Plan
Note from doc: The original Python function create_inference_graph that was used in TensorFlow 1.13 and earlier is deprecated in TensorFlow >1.13 and removed in TensorFlow 2.0.

Having as target a Jetson Nano + TritonServer:

  1. Is it a good idea to extract the TRT graph from the optimized model.savedmodel?
  2. If it is, I would appreciate help with the code snippet.
  3. If not, what is the correct way of running TF-TRT on Triton? I mean the parameters for the server.
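Regarding question 3, a TF-TRT converted SavedModel can be served by Triton as an ordinary TensorFlow SavedModel. A hypothetical config.pbtxt sketch (model name, tensor names, and shapes are placeholders that must match your actual model):

```
name: "my_tftrt_model"
platform: "tensorflow_savedmodel"
max_batch_size: 8
input [
  {
    name: "input_1"
    data_type: TYPE_FP32
    dims: [ 224, 224, 3 ]
  }
]
output [
  {
    name: "predictions"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
instance_group [ { kind: KIND_GPU } ]
```

With this approach the embedded TensorRT engines are built at load/run time on the target GPU, which sidesteps the engine-portability issue between the workstation and the Nano.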

Can you guide me on this? Thanks!


TensorRT Version:
GPU Type: dGPU and Jetson
Nvidia Driver Version:
CUDA Version:
CUDNN Version:
Operating System + Version: Ubuntu 20.04
Python Version (if applicable):
TensorFlow Version (if applicable):
Baremetal or Container (if container which image + tag):

This looks like a Jetson issue. Please refer to the samples below in case they are useful.

For any further assistance, we recommend you raise it on the respective platform via the link below.



This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.