Run a U-Net segmentation model on Jetson Nano / Convert .pb to TensorRT

Hello,

I trained a U-Net segmentation model with Keras (using the TensorFlow backend).
I am trying to convert its frozen graph (.pb) to TensorRT on the Jetson Nano, but the process gets killed (as shown below). I have seen in other posts that this could be related to an "out of memory" problem.
For context, I already have an SSD MobileNet V2 model running on the Jetson Nano.
If I stop the systemctl service, I can run inference with the U-Net model without converting it to TensorRT (just loading the frozen graph with TensorFlow). Since this no longer works once the service is started (i.e. while the other network is running), I tried to convert the U-Net model to TensorRT to get an optimized version of it, but the conversion fails with a killed process. This may not be the right approach, though.

Is it possible to run two neural networks on a Jetson Nano? Is there another way to do this?

Any help would be greatly appreciated.

Thanks a lot

For information, here is how I try to convert the frozen graph to TensorRT:

import tensorflow.contrib.tensorrt as trt  # TF-TRT lives in tf.contrib on TensorFlow 1.x

trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph_gd,  # pass the parsed frozen GraphDef here
    outputs=['conv2d_24/Sigmoid'],    # name of the U-Net output tensor
    max_batch_size=1,
    max_workspace_size_bytes=1 << 32, # I have tried 1 << 25 (32 MiB) and 1 << 32 (4 GiB) here
    precision_mode='FP16'
)
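
For reference, frozen_graph_gd above is the frozen .pb parsed into a GraphDef, along these lines (the file path is only an example):

import tensorflow as tf

pb_path = 'unet_frozen.pb'  # example path to the frozen graph

with tf.io.gfile.GFile(pb_path, 'rb') as f:
    frozen_graph_gd = tf.compat.v1.GraphDef()
    frozen_graph_gd.ParseFromString(f.read())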

And here is the output when the process is killed (during conversion of the U-Net frozen graph to TensorRT):

2020-10-05 16:00:58.200269: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2

WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.

2020-10-05 16:01:11.976893: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libnvinfer.so.7

2020-10-05 16:01:11.994472: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libnvinfer_plugin.so.7

WARNING:tensorflow:

The TensorFlow contrib module will not be included in TensorFlow 2.0.

For more information, please see:

* https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md

* https://github.com/tensorflow/addons

* https://github.com/tensorflow/io (for I/O related ops)

If you depend on functionality not listed there, please file an issue.

WARNING:tensorflow:From convert_pb_to_tensorrt.py:14: The name tf.GraphDef is deprecated. Please use tf.compat.v1.GraphDef instead.

2020-10-05 16:01:13.678101: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libnvinfer.so.7

2020-10-05 16:01:15.506432: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1

2020-10-05 16:01:15.512224: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero

2020-10-05 16:01:15.512359: I tensorflow/core/grappler/devices.cc:55] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0

2020-10-05 16:01:15.512638: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session

2020-10-05 16:01:15.532712: W tensorflow/core/platform/profile_utils/cpu_utils.cc:98] Failed to find bogomips in /proc/cpuinfo; cannot determine CPU frequency

2020-10-05 16:01:15.533264: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x328fd900 initialized for platform Host (this does not guarantee that XLA will be used). Devices:

2020-10-05 16:01:15.533318: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version

2020-10-05 16:01:15.632451: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero

2020-10-05 16:01:15.632757: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x30d0edb0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:

2020-10-05 16:01:15.632808: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): NVIDIA Tegra X1, Compute Capability 5.3

2020-10-05 16:01:15.633163: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero

2020-10-05 16:01:15.633276: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1634] Found device 0 with properties: 

name: NVIDIA Tegra X1 major: 5 minor: 3 memoryClockRate(GHz): 0.9216

pciBusID: 0000:00:00.0

2020-10-05 16:01:15.633348: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2

2020-10-05 16:01:15.633500: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10

2020-10-05 16:01:15.716786: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10

2020-10-05 16:01:15.903326: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10

2020-10-05 16:01:16.060655: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10

2020-10-05 16:01:16.141950: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10

2020-10-05 16:01:16.142219: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.8

2020-10-05 16:01:16.142553: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero

2020-10-05 16:01:16.142878: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero

2020-10-05 16:01:16.142991: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1762] Adding visible gpu devices: 0

2020-10-05 16:01:16.143133: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2

2020-10-05 16:01:27.700226: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1175] Device interconnect StreamExecutor with strength 1 edge matrix:

2020-10-05 16:01:27.700377: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] 0 

2020-10-05 16:01:27.700417: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1194] 0: N 

2020-10-05 16:01:27.713559: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero

2020-10-05 16:01:27.713897: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:952] ARM64 does not support NUMA - returning NUMA node zero

2020-10-05 16:01:27.714101: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1320] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 200 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0, compute capability: 5.3)

Killed

Hi,

Please note that the TensorRT optimizer embedded in the TensorFlow framework consumes a lot of memory.
To allow operation fallback, it creates one engine for TensorFlow and another for TensorRT.
As a result, it is easy to reach the Nano's memory limit, which is 4 GB.

To solve this, would you mind allocating some swap memory to see if it helps?
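For example, a 4 GB swap file can be created like this (the size and path here are only examples):

sudo fallocate -l 4G /mnt/4GB.swap   # reserve 4 GB for the swap file
sudo chmod 600 /mnt/4GB.swap         # restrict permissions as required by swapon
sudo mkswap /mnt/4GB.swap            # format it as swap space
sudo swapon /mnt/4GB.swap            # enable it immediately
# To enable it automatically after reboot, add this line to /etc/fstab:
# /mnt/4GB.swap swap swap defaults 0 0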
For more detail, please check this comment:

Thanks.


Thank you for your answer; it works better with swap memory.