Running trt.create_inference_graph, kernel restarting

sachinkm308 · October 5, 2021, 1:14pm

Description

A clear and concise description of the bug or issue.

Environment

TensorRT Version: 7.1.3-1
GPU Type: Volta GPU
Nvidia Driver Version: 32.5.1
CUDA Version: cuda 10.2
CUDNN Version: 8.0.0.180
Operating System + Version: Ubuntu 18.02
Python Version (if applicable): Python 3.6
TensorFlow Version (if applicable): 1.15.5+nv21.6
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Relevant Files - I am trying to run the Jupyter notebook detection.ipynb from examples/detection in the NVIDIA-AI-IOT/tf_trt_models repository on GitHub (master branch).

Repository used → GitHub - NVIDIA-AI-IOT/tf_trt_models: TensorFlow models accelerated with NVIDIA TensorRT

Steps To Reproduce

After I run the Jupyter notebook, everything works fine until I run the block below.

Optimize the model with TensorRT

trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=output_names,
    max_batch_size=1,
    max_workspace_size_bytes=1 << 25,
    precision_mode='FP16',
    minimum_segment_size=50
)

Below is the error I am getting

Kernel Restarting
The kernel for /tf_trt_models/detection.ipynb appears to have died. It will restart automatically.

I am not sure why it dies. Please help.
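When a Jupyter kernel dies, the notebook usually hides the underlying error. One way to recover it is to run the same conversion outside the notebook, in a child process, and inspect its exit status and log. The following is a minimal sketch, assuming the notebook cells up to and including the conversion have been saved to a standalone script; the file names `convert_trt.py` and `convert.log` are hypothetical and do not come from the thread.

```python
import signal
import subprocess
import sys


def run_isolated(script_path, log_path="convert.log"):
    """Run a Python script in a child process and save its combined output.

    script_path is assumed to be a standalone copy of the notebook code
    (hypothetical file, e.g. convert_trt.py).
    """
    with open(log_path, "w") as log:
        proc = subprocess.run(
            [sys.executable, script_path],
            stdout=log,
            stderr=subprocess.STDOUT,
        )
    return proc.returncode


def describe_exit(returncode):
    """Turn a subprocess return code into a readable summary."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, a negative code means the child was killed by that signal.
        return "killed by signal " + signal.Signals(-returncode).name
    return "exited with status {}".format(returncode)
```

Calling `describe_exit(run_isolated("convert_trt.py"))` then reports how the process ended. A result such as `killed by signal SIGKILL` may indicate that the operating system stopped the process, for example because it ran out of memory, but that is a possibility to check against the system logs, not a confirmed cause for this issue.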

NVES · October 5, 2021, 7:38pm

Hi,
Can you try running your model with the trtexec command and share the "--verbose" log if the issue persists?
https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/trtexec
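The trtexec suggestion above could be scripted roughly as follows. This is only a sketch under several assumptions: the model has already been exported to ONNX (the thread's model is a TensorFlow frozen graph, so that export step is not shown), `model.onnx` and `trtexec_verbose.log` are hypothetical file names, and `trtexec` is reachable on `PATH`. The 32 MB workspace and FP16 setting mirror the `1 << 25` bytes and `precision_mode='FP16'` used in the notebook call.

```python
import shutil
import subprocess


def build_trtexec_command(onnx_path, trtexec="trtexec", workspace_mb=32, fp16=True):
    """Build a trtexec command line that produces a verbose log."""
    cmd = [
        trtexec,
        "--onnx=" + onnx_path,                    # hypothetical ONNX export of the model
        "--workspace={}".format(workspace_mb),    # workspace size in MB
        "--verbose",                              # detailed log requested in the reply
    ]
    if fp16:
        cmd.append("--fp16")
    return cmd


def run_trtexec(cmd, log_path="trtexec_verbose.log"):
    """Run trtexec and write stdout and stderr to a log file."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(cmd[0] + " was not found on PATH")
    with open(log_path, "w") as log:
        return subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT).returncode
```

For example, `run_trtexec(build_trtexec_command("model.onnx"))` writes the verbose output to `trtexec_verbose.log`, which can then be attached to the thread.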

You can refer to the link below for the list of supported operators. If an operator is not supported, you need to create a custom plugin to support that operation.

https://github.com/onnx/onnx-tensorrt/blob/main/docs/operators.md

<!--- SPDX-License-Identifier: Apache-2.0 -->

# Supported ONNX Operators

TensorRT 8.4 supports operators up to Opset 17. Latest information of ONNX operators can be found [here](https://github.com/onnx/onnx/blob/master/docs/Operators.md)

TensorRT supports the following ONNX data types: DOUBLE, FLOAT32, FLOAT16, INT8, and BOOL

> Note: There is limited support for INT32, INT64, and DOUBLE types. TensorRT will attempt to cast down INT64 to INT32 and DOUBLE down to FLOAT, clamping values to `+-INT_MAX` or `+-FLT_MAX` if necessary.

See below for the support matrix of ONNX operators in ONNX-TensorRT.

## Operator Support Matrix

| Operator | Supported | Supported Types   | Restrictions |
|----------|-----------|-------------------|--------------|
| Abs      | Y         | FP32, FP16, INT32 |              |
| Acos     | Y         | FP32, FP16        |              |
| Acosh    | Y         | FP32, FP16        |              |
| Add      | Y         | FP32, FP16, INT32 |              |

Also, please share your model and script, if you have not already, so that we can help you better.

Meanwhile, for some common errors and queries please refer to below link:

Thanks!

sachinkm308 · October 6, 2021, 9:25am

@NVES It looks like TensorRT 7.x.x versions are not tested against TensorFlow 1.15.5 (which is the version I am using).

I probably have to upgrade TensorRT to an 8.x.x version, because the release notes show it is tested against TensorFlow 1.15.5. Hopefully that fixes my issue…
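Before comparing against the release notes, it can help to confirm which versions are actually installed in the notebook's Python environment. A minimal sketch, assuming both packages expose a `__version__` attribute and that the `tensorrt` Python bindings are installed alongside TensorFlow; the helper names are illustrative only.

```python
import importlib


def installed_version(module_name):
    """Return a module's __version__, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


def report_versions(module_names=("tensorflow", "tensorrt")):
    """Map each module name to its installed version (None if missing)."""
    return {name: installed_version(name) for name in module_names}
```

Printing `report_versions()` in the notebook shows the two version strings side by side. Note that this reports the version of the `tensorrt` Python package, which may not be the same as the TensorRT library that the TensorFlow build was linked against.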

spolisetty · October 6, 2021, 11:24am

Hope the samples below are helpful to you.
https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#samples
https://docs.nvidia.com/deeplearning/tensorrt/quick-start-guide/index.html#framework-integration
https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#integrate-ovr
https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#usingtftrt
