Cannot Convert Custom Model To TensorRT

sorozco0612 · April 19, 2021, 7:36pm

Description

I am trying to run a trained model in a real-time system. I feed a video stream into the model, but it currently processes only about 1-2 frames per second. After some research, I read that TensorRT can help with this, so I am trying to convert my model to TensorRT. I copied the following code from the NVIDIA guide "Accelerating Inference In TF-TRT User Guide :: NVIDIA Deep Learning Frameworks Documentation".

The code and error are as follows:

Code

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.python.compiler.tensorrt import trt_convert as trt
from tensorflow.python.saved_model import tag_constants


# Defined before use so that converter.build() can find it.
def my_input_fn():
    # Input for a single inference call, for a network that has two input tensors:
    inp1 = np.random.normal(size=(8, 16, 16, 3)).astype(np.float32)
    inp2 = np.random.normal(size=(8, 16, 16, 3)).astype(np.float32)
    yield (inp1, inp2)


conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS
conversion_params = conversion_params._replace(max_workspace_size_bytes=(1 << 32))
conversion_params = conversion_params._replace(precision_mode="FP16")
conversion_params = conversion_params._replace(maximum_cached_engines=100)
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir='/home/sorozco/alan_bot_model',
    conversion_params=conversion_params)
converter.convert()
converter.build(input_fn=my_input_fn)
converter.save('/home/sorozco/alan_bot_model_trt')

model = tf.saved_model.load('/home/sorozco/alan_bot_model_trt/', tags=[tag_constants.SERVING])

Error

2021-04-19 15:09:18.670488: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.2
2021-04-19 15:09:28.223133: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libnvinfer.so.7
2021-04-19 15:09:29.176650: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-04-19 15:09:29.217699: E tensorflow/stream_executor/cuda/cuda_driver.cc:328] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2021-04-19 15:09:29.217798: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (sorozco-desktop): /proc/driver/nvidia/version does not exist
2021-04-19 15:09:29.248773: W tensorflow/core/platform/profile_utils/cpu_utils.cc:116] Failed to find bogomips or clock in /proc/cpuinfo; cannot determine CPU frequency
2021-04-19 15:09:29.249723: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x40da2780 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-04-19 15:09:29.249795: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2021-04-19 15:09:44.597539: I tensorflow/core/grappler/devices.cc:69] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
2021-04-19 15:09:44.597798: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
2021-04-19 15:09:44.662813: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:954] Optimization results for grappler item: graph_to_optimize
  function_optimizer: Graph size after: 45 nodes (34), 60 edges (49), time = 4.423ms.
  function_optimizer: function_optimizer did nothing. time = 0.119ms.

2021-04-19 15:09:55.306136: I tensorflow/core/grappler/devices.cc:69] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
2021-04-19 15:09:55.391003: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
Killed

I’m particularly nervous about the following line:

2021-04-19 15:09:44.597539: I tensorflow/core/grappler/devices.cc:69] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0

Does this mean that I am not utilizing the GPU in my Jetson Nano? I would also like to know how to convert my custom model to TensorRT in general. This is somewhat foreign territory to me, and most tutorials are not very helpful.
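One way to check the GPU question directly, as a sketch rather than a definitive diagnosis, is to ask TensorFlow which GPUs it can see. `tf.config.list_physical_devices` is part of the TensorFlow 2.x API; the helper name `visible_gpus` is made up for illustration. An empty list would be consistent with the `CUDA_ERROR_NO_DEVICE` line in the log above.

```python
def visible_gpus():
    """Return the names of the GPUs TensorFlow can see.

    Returns None when TensorFlow is not installed, so the script
    still runs on a machine without it.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # list_physical_devices returns one entry per device TensorFlow detected.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = visible_gpus()
if gpus is None:
    print("TensorFlow is not installed in this environment.")
elif not gpus:
    print("TensorFlow sees no GPU; inference would run on the CPU.")
else:
    print("GPUs visible to TensorFlow:", gpus)
```

Running this on the Jetson, in the same environment and as the same user as the conversion script, should show whether the problem is specific to the conversion or affects TensorFlow in general.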

Environment

TensorRT Version:

dpkg -l | grep nvinfer
ii  libnvinfer-bin                             7.1.3-1+cuda10.2                                 arm64        TensorRT binaries

GPU Type:

I'm not sure how to check this.

Nvidia Driver Version:

I'm not sure how to check this.

CUDA Version:

cat /usr/local/cuda/version.txt
CUDA Version 10.2.89

CUDNN Version:

cat /usr/include/cudnn_version.h | grep CUDNN_MAJOR -A 2

#define CUDNN_MAJOR 8
#define CUDNN_MINOR 0
#define CUDNN_PATCHLEVEL 0
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)

#endif /* CUDNN_VERSION_H */

Operating System + Version:

cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04.5 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.5 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

Python Version (if applicable): Python 3.6.9
TensorFlow Version (if applicable): 2.4.0
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

NVES · April 19, 2021, 7:37pm

Hi,
This looks like a Jetson issue. We recommend raising it on the respective platform forum using the link below.

Thanks!

sorozco0612 · April 19, 2021, 7:42pm

I would disagree. Although the code is run on a Jetson Nano, the error pertains to deep learning and TensorRT. I can omit Jetson Nano from the title.

NVES · April 19, 2021, 8:07pm

Hi,
Please share the ONNX model and the script, if you have not already, so that we can assist you better.
In the meantime, you can try a few things:

1) Validate your model with the snippet below.

check_model.py

import sys
import onnx
filename = sys.argv[1]  # path to your ONNX model, passed on the command line
model = onnx.load(filename)
onnx.checker.check_model(model)

2) Try running your model with the trtexec command.
https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/trtexec
If you are still facing the issue, please share the trtexec "--verbose" log for further debugging.
Thanks!

sorozco0612 · April 19, 2021, 8:15pm

Thank you for the help!

I am confused by what you mean by “ONNX”. My TensorFlow model is a .pb file. The code I used to train my model can be found here: alan_bot/autonomous_driving_model.ipynb at model_creation · FezTheImmigrant/alan_bot · GitHub

Am I missing some conversion step to ONNX?

spolisetty · April 20, 2021, 1:19pm

Hi @sorozco0612,

Sorry for the confusion. Could you please share the TensorFlow model (.pb) and the conversion script with us? We would like to reproduce the error on our end so that we can assist you better.

Thank you.

sorozco0612 · April 20, 2021, 3:35pm

Of course!
saved_model.pb (133.0 KB) conversion.py (1.2 KB)

sorozco0612 · April 20, 2021, 4:06pm

Apologies, I uploaded the wrong conversion file. conversion.py (1.0 KB)

sorozco0612 · April 20, 2021, 5:49pm

So I have made some progress. I decided to switch the conversion process over to my development machine. I altered the conversion code to:

def my_input_fn():
    # Input for a single inference call; this network has one input tensor.
    inp1 = np.random.normal(size=(1, 432, 614, 3)).astype(np.float32)
    yield (inp1,)


conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS
conversion_params = conversion_params._replace(max_workspace_size_bytes=(1 << 32))
conversion_params = conversion_params._replace(precision_mode="FP16")
conversion_params = conversion_params._replace(maximum_cached_engines=100)
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="/home/sorozco0612/dev/alan_bot/model_creation/alan_bot_model",
    conversion_params=conversion_params,
)

converter.convert()
converter.build(input_fn=my_input_fn)
converter.save("/home/sorozco0612/alan_bot_model_trt")

This actually creates the TF-TRT model, which I assume is a good sign.

This is the code I am now using to run inference on the converted model:

model = tf.saved_model.load(
    "/home/sorozco0612/alan_bot_model_trt/",
    tags=[tf.compat.v1.saved_model.tag_constants.SERVING],
)

graph_func = model.signatures[
    tf.compat.v1.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY
]

frozen_func = trt.convert_to_constants.convert_variables_to_constants_v2(graph_func)

output = frozen_func(np.zeros((432, 614, 3), np.uint8))[0].numpy()

I am just trying to infer with an empty image created using numpy, but I get the following error:

Traceback (most recent call last):
  File "conversion.py", line 37, in <module>
    output = frozen_func(np.zeros((432, 614, 3), np.uint8))[0].numpy()
  File "/home/sorozco0612/virtual_envs/computer_vision/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 1669, in __call__
    return self._call_impl(args, kwargs)
  File "/home/sorozco0612/virtual_envs/computer_vision/lib/python3.6/site-packages/tensorflow/python/eager/wrap_function.py", line 247, in _call_impl
    args, kwargs, cancellation_manager)
  File "/home/sorozco0612/virtual_envs/computer_vision/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 1687, in _call_impl
    return self._call_with_flat_signature(args, kwargs, cancellation_manager)
  File "/home/sorozco0612/virtual_envs/computer_vision/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 1735, in _call_with_flat_signature
    type(arg).__name__, str(arg)))
TypeError: pruned(conv2d_input): expected argument #0(zero-based) to be a Tensor; got ndarray ([[[0 0 0]
[0 0 0]
[0 0 0]
...

I guess my issue now is that I do not know how to pass data into my TensorRT model.
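The TypeError above says the function expected a Tensor and received an ndarray. Here is a minimal sketch of preparing a frame, assuming the model takes a float32 batch of shape (1, 432, 614, 3), as in my_input_fn. `to_model_input` is a hypothetical helper name, and whether the pixel values also need rescaling depends on how the model was trained.

```python
import numpy as np


def to_model_input(frame):
    """Turn one HxWx3 frame into a 1xHxWx3 float32 batch."""
    # Add the leading batch dimension that the model's input signature expects.
    batch = np.expand_dims(frame, axis=0)
    # Cast to float32 to match the dtype used when building the engine.
    return batch.astype(np.float32)


frame = np.zeros((432, 614, 3), np.uint8)
batch = to_model_input(frame)
print(batch.shape, batch.dtype)

# With TensorFlow available, the batch is then wrapped in a Tensor before the call:
#   output = frozen_func(tf.constant(batch))[0].numpy()
```

The commented call reuses `frozen_func` from the post above; `tf.constant` converts the NumPy array into the Tensor that the error message asks for.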

sorozco0612 · April 20, 2021, 10:07pm

It seems my issues have extended beyond the original reason for this post, so I will mark my last post as the solution.
