Unable to run optimised network on TX2 using TensorFlow GPU

I have been trying to run a segmentation network on a TX2. It primarily gives a memory-allocation error, and changing different options in TensorFlow just pops up different errors, all of which seem related to memory allocation. I am able to run the network on the TX2 in CPU-only mode at a rate of 1 frame per 10 seconds, but even after graph transforms as well as TensorRT optimization I get the same error.
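For reference, the kind of session options I have been changing (a minimal sketch with example values; none of them avoided the error):

```python
import tensorflow as tf  # TF 1.x API; on TF 2.x the same options live under tf.compat.v1
if hasattr(tf, "compat") and hasattr(tf.compat, "v1"):
    tf = tf.compat.v1

config = tf.ConfigProto()
# Grow GPU allocations on demand instead of grabbing everything up front.
config.gpu_options.allow_growth = True
# Alternatively, hard-cap TensorFlow's share of the TX2's (shared) memory.
config.gpu_options.per_process_gpu_memory_fraction = 0.5
# sess = tf.Session(config=config)  # pass the config when creating the session
```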

Can you provide any insights?

My TensorFlow version is 1.8, running on Ubuntu 16.04 with CUDA 9.

Hi,

It’s recommended to upgrade the system to our latest JetPack software first.

Are you using TF-TRT?
In GPU mode, TF-TRT tends to allocate memory twice (once for the CPU copy and once for the GPU copy), which usually leads to an OOM error.
It’s recommended to use pure TensorRT instead:
https://github.com/NVIDIA-AI-IOT/tf_to_trt_image_classification

Thanks.

Hi AastaLLL,

Thanks for your input. To try your recommendation, I installed JetPack 4.2 on the TX2, with CUDA 10 and everything. Now I am unable to find a TensorFlow GPU build for 4.2.2. I want to work with Python 2 due to ROS compatibility. Could you help me with that?

Also, the OOM error occurs even without TRT optimization. I froze the graph and tried to run it before optimizing, which yields the same error.
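For reference, the freezing step looks roughly like this (a toy two-op graph standing in for my segmentation network; the real script uses my checkpoint and output node names):

```python
import tensorflow as tf  # TF 1.x API; on TF 2.x use tf.compat.v1
if hasattr(tf, "compat") and hasattr(tf.compat, "v1"):
    tf = tf.compat.v1

graph = tf.Graph()
with graph.as_default():
    x = tf.placeholder(tf.float32, [None, 4], name="input")
    # use_resource=False keeps classic VariableV2 nodes, as in TF 1.8.
    w = tf.get_variable("w", [4, 2], use_resource=False)
    y = tf.identity(tf.matmul(x, w), name="output")
    with tf.Session(graph=graph) as sess:
        sess.run(tf.global_variables_initializer())
        # Replace every variable with a Const holding its current value.
        frozen = tf.graph_util.convert_variables_to_constants(
            sess, graph.as_graph_def(), ["output"])

# The frozen GraphDef can then be serialized to a .pb file:
# with open("frozen.pb", "wb") as f:
#     f.write(frozen.SerializeToString())
```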

Thanks.

Hi,

We provide some prebuilt TensorFlow packages for Jetson users.
Please check this document for information:
https://docs.nvidia.com/deeplearning/frameworks/install-tf-jetson-platform/index.html

If you are facing a memory issue, it’s recommended to use pure TensorRT instead.
By default, TF-TRT duplicates the pipeline for both TensorFlow and TensorRT, which consumes a lot of memory.

Thanks.

Hi,

Thank you for looking into the issue.
I updated my TX2 to JetPack 4.2 with CUDA 10 and TensorFlow 1.14, converted my script to Python 3, and built cv_bridge with ROS to support Python 3. The network now runs absolutely fine when loading weights via the checkpoint (ckpt) method. Even without optimizing for inference or applying graph transforms, it is able to load and run, which makes me think there was some issue in the CUDA 9 memory-allocation technique, as no OOM error is present now.
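For reference, the checkpoint-loading path that now works looks roughly like this (a toy graph in place of the segmentation network; paths and names are illustrative):

```python
import os
import tempfile
import tensorflow as tf  # TF 1.14 here; on TF 2.x use tf.compat.v1
if hasattr(tf, "compat") and hasattr(tf.compat, "v1"):
    tf = tf.compat.v1

ckpt_path = os.path.join(tempfile.mkdtemp(), "model.ckpt")

# Build and save a toy model (stands in for the training-side script).
g = tf.Graph()
with g.as_default():
    x = tf.placeholder(tf.float32, [None, 4], name="input")
    w = tf.get_variable("w", [4, 2])
    y = tf.identity(tf.matmul(x, w), name="output")
    saver = tf.train.Saver()
    with tf.Session(graph=g) as sess:
        sess.run(tf.global_variables_initializer())
        saver.save(sess, ckpt_path)

# Inference side: rebuild the graph from the .meta file and restore weights.
g2 = tf.Graph()
with g2.as_default():
    with tf.Session(graph=g2) as sess:
        restorer = tf.train.import_meta_graph(ckpt_path + ".meta")
        restorer.restore(sess, ckpt_path)
        out = sess.run("output:0", {"input:0": [[1.0, 2.0, 3.0, 4.0]]})
```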
Also, I was facing some issues with graph transforms and freezing the graph due to batch norm in the network, and those are still there: after the transform or freezing, when loading the graph from the .pb file, an error occurs saying that a float was expected but a float_ref was passed, which I assume is a TensorFlow error. I understand this isn’t the correct place to ask this, but if you have any idea about it, please do let me know.
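For reference, a workaround I found in community posts rewrites the ref-typed ops that batch norm’s moving averages leave behind in the frozen GraphDef (adapted, not yet verified on my network; the "moving_" name pattern is an assumption about how the batch-norm variables are named):

```python
def fix_batchnorm_ref_ops(graph_def):
    """Rewrite ops that still expect float_ref inputs so a frozen
    graph containing batch norm loads cleanly. Mutates graph_def in place."""
    for node in graph_def.node:
        if node.op == "RefSwitch":
            node.op = "Switch"
            # Point the switch at the read tensor of the moving statistics.
            for i, name in enumerate(node.input):
                if "moving_" in name:
                    node.input[i] = name + "/read"
        elif node.op == "AssignSub":
            node.op = "Sub"
            if "use_locking" in node.attr:
                del node.attr["use_locking"]
        elif node.op == "AssignAdd":
            node.op = "Add"
            if "use_locking" in node.attr:
                del node.attr["use_locking"]
    return graph_def
```

I would call fix_batchnorm_ref_ops(graph_def) on the GraphDef parsed from the .pb file, before passing it to tf.import_graph_def.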
Thanks again.