Converting a TF model on Jetson TX2 is slow

Description

I am trying to convert a TensorFlow classification model to TensorRT.
At first the conversion process was killed (memory was full), so I increased the swap file size.
After that it started, but it has now been running for more than 2 hours and still has not finished.
Memory is completely full and swap usage is at 4.5 GB.

My example is based on:

from tensorflow.python.compiler.tensorrt import trt_convert as trt

print('Converting to TF-TRT FP16...')
conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
    precision_mode=trt.TrtPrecisionMode.FP16,
    max_workspace_size_bytes=4000000000)  # ~4 GB workspace for TRT tactics
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir='resnet50_saved_model',
    conversion_params=conversion_params)
converter.convert()
converter.save(output_saved_model_dir='resnet50_saved_model_TFTRT_FP16')
print('Done Converting to TF-TRT FP16')

Environment

GPU: Jetson TX2
Operating System + Version: JetPack 4.3

Can you try verbose logging in TRT and share the verbose log?
Also, please try to run the TF model directly to check the performance/memory consumption before conversion.
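For example, a minimal way to load the SavedModel and run a dummy inference (the input shape here assumes the default Keras ResNet-50 export; adjust it for your model):

import numpy as np
import tensorflow as tf

# Load the original (unconverted) SavedModel and run one dummy inference
# to get a baseline for memory use and latency.
model = tf.saved_model.load('resnet50_saved_model')
infer = model.signatures['serving_default']

# Look up the signature's input name instead of hard-coding it.
input_name = list(infer.structured_input_signature[1].keys())[0]
dummy = tf.constant(np.random.rand(1, 224, 224, 3).astype(np.float32))
print(infer(**{input_name: dummy}).keys())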

Thanks

Here is my console output: 1.txt (16.1 KB)
The conversion finished after more than 3 hours.

How do I run the model directly?

P.S. On my laptop the example works fine.

It may be due to the available GPU memory.
Can you check system memory via “$ sudo tegrastats” to see if you reach the memory bound?
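If your tegrastats build supports the logging options, you can also record the readings to a file while the conversion runs, for example:

sudo tegrastats --interval 1000 --logfile tegrastats.log &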

Thanks

I tried converting a lightweight model (MobileNet) and it works well.

I checked the memory after the first optimizer operation:

RAM 7570/7861MB (lfb 8x4MB) SWAP 911/3930MB (cached 53MB) CPU [2%@345,1%@2029,93%@2028,0%@345,0%@345,1%@345] EMC_FREQ 1%@1866 GR3D_FREQ 0%@114 APE 150 MTS fg 0% bg 3% PLL@52C MCPU@52C PMIC@100C Tboard@47C GPU@49.5C BCPU@52C thermal@51.2C Tdiode@49.5C VDD_SYS_GPU 94/140 VDD_SYS_SOC 851/869 VDD_4V0_WIFI 0/10 VDD_IN 4641/4715 VDD_SYS_CPU 1324/1493 VDD_SYS_DDR 1303/1220

I tensorflow/core/grappler/optimizers/meta_optimizer.cc:841] Optimization results for grappler item: graph_to_optimize
2020-06-08 11:10:11.369224: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] function_optimizer: Graph size after: 3089 nodes (2543), 6797 edges (6249), time = 413.735ms.
2020-06-08 11:10:11.369284: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:843] function_optimizer: function_optimizer did nothing. time = 7.225ms.

I tried resizing the swap to 9 GB, but it allocates all of the memory and swap.

Swap space cannot be used by TensorRT.
It seems the model is using almost all of the GPU memory, hence the slow performance.
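Since the TX2's 8 GB of RAM is shared between the CPU and GPU, it may also help to let TensorFlow allocate GPU memory on demand and to use a smaller TRT workspace than 4 GB before converting. A minimal sketch (the 1 GB value is illustrative, not a recommendation):

import tensorflow as tf
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Let TF allocate GPU memory on demand instead of grabbing it all up front.
for gpu in tf.config.experimental.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)

# Use a much smaller TRT builder workspace than 4 GB on a shared-memory board.
conversion_params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
    precision_mode=trt.TrtPrecisionMode.FP16,
    max_workspace_size_bytes=1 << 30)  # 1 GB, illustrative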

Thanks

So, I have a model that uses nearly 4 GB of GPU memory, and it works fine on the Jetson TX2.
What happens when I try to convert it? Does the conversion need more memory? Does the model become larger? Why is there not enough memory?

Which way is better? Converting on another device? ONNX?

Can you try the TF → ONNX → TRT workflow?
You may have to create a custom layer for any unsupported layers.
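For the TF → ONNX step, the tf2onnx converter is one common option, for example (the output file name and opset value here are placeholders):

python3 -m tf2onnx.convert --saved-model resnet50_saved_model --output resnet50.onnx --opset 11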

After generating the ONNX model, you can even use the trtexec command-line tool to quickly test it and generate the TRT engine:
https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/trtexec
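For example (file names are placeholders; --workspace is given in MB, so a value like 2048 keeps the builder well within the TX2's memory):

./trtexec --onnx=resnet50.onnx --fp16 --workspace=2048 --saveEngine=resnet50_fp16.trt --verbose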

Thanks

So, I tried ONNX for the YOLO model conversion (the default example in TensorRT). That part works OK. But when I try to convert it from ONNX to TensorRT with trtexec, it gives me a memory error:

[06/18/2020-12:01:57] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
Killed

cmd:

./trtexec --onnx=/home/kuskovtx2/yolo_onnx/fp16/yolov3.onnx

I finally converted the model with the sample from /usr/src/tensorrt/samples/python/yolov3_onnx. It works OK.

My general question now is: why does one example work for YOLO while the other does not? Maybe a difference in memory usage?

Can you share the verbose log along with the model so we can help better?

In this case, did you use just the “onnx_to_tensorrt.py” script to convert your ONNX model, or are you referring to a successful run of the complete sample code?
Can you compare the two ONNX models to check whether they are the same?

Thanks

  1. To convert to ONNX I use /usr/src/tensorrt/samples/python/yolov3_onnx.py
  2. To convert to TensorRT I use:
  • /usr/src/tensorrt/samples/python/onnx_to_tensorrt.py (works well; see the sketch below)
  • ./trtexec --onnx=/home/kuskovtx2/yolo_onnx/fp16/yolov3.onnx (fails with “Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.” and is then Killed). Verbose log: verbose_log.txt (344.8 KB)
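For reference, building the engine through the TensorRT Python API with an explicitly capped builder workspace looks roughly like this sketch on TRT 6/7 (the workspace value and FP16 flag are illustrative, not the sample's exact settings):

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)

def build_engine(onnx_path, workspace_mb=1024):
    # ONNX models require an explicit-batch network in TRT 6/7.
    explicit_batch = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    with trt.Builder(TRT_LOGGER) as builder, \
         builder.create_network(explicit_batch) as network, \
         trt.OnnxParser(network, TRT_LOGGER) as parser:
        builder.max_workspace_size = workspace_mb << 20  # cap the builder workspace
        builder.fp16_mode = True                         # use FP16 kernels where available
        with open(onnx_path, 'rb') as f:
            if not parser.parse(f.read()):
                for i in range(parser.num_errors):
                    print(parser.get_error(i))
                return None
        return builder.build_cuda_engine(network)

engine = build_engine('yolov3.onnx')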

Could you please share the ONNX model so we can reproduce the issue?
Meanwhile, could you please try using the latest TRT 7 release (JetPack 4.4)?

Thanks

So, I used ./trtexec and it works. I just ran my Jetson without a display. Sometimes the process uses swap memory, but not more than 2 GB.
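For anyone else hitting this: on a systemd-based L4T image, one way to boot without the desktop (so it does not take up memory) is to switch the default target, e.g.:

sudo systemctl set-default multi-user.target   # switch back with graphical.target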

Onnx model here:
https://drive.google.com/file/d/10iNpiNmQrcC0NlbV1NGWq5jsgVxB2CBq/view?usp=sharing

Does this mean the issue was resolved by running the Jetson without a display?

Thanks


Yes, the issue was resolved!