TensorFlow-TensorRT inference time and memory consumption on Nano

arunponnusamy · July 15, 2019, 7:37am

Hi,

I am trying to run inference with an image classification model (ResNetV2) using TensorRT-optimized graphs (FP32 and FP16) in TensorFlow on a Jetson Nano.

When I run the inference script, it waits for minutes before inference starts, and the first iteration then takes many seconds (the loop contains only sess.run). After the first iteration, inference time drops to milliseconds. Meanwhile, memory consumption is close to 3.4 GB out of 4 GB.

Is this behaviour expected? Does TensorRT optimize only inference time and not memory usage?
What are the best practices for reducing memory consumption on Jetson Nano?

I am using the ImageNet-pretrained ResNetV2 frozen graph from https://github.com/tensorflow/models/tree/r1.13.0/research/tensorrt#model-links for the TensorRT conversion, and the official imagenet_preprocessing script from https://github.com/tensorflow/models/blob/r1.13.0/official/resnet/imagenet_preprocessing.py to preprocess the image.

TensorFlow: 1.13.1
TensorRT: 5.0.6

Inference snippet:

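import time

import numpy as np
import tensorflow as tf
# Importing the TF-TRT module registers TRTEngineOp, which the converted graph uses (TF 1.x).
import tensorflow.contrib.tensorrt as trt
# Assumption: imagenet_preprocessing.py from tensorflow/models is importable from the script's directory.
import imagenet_preprocessing

# INPUT_IMAGE_PATH, PATH_TO_FROZEN_GRAPH and GPU_MEM_FRACTION are defined elsewhere in the script.

def preprocess_image(file_name, output_height=224, output_width=224,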
                     num_channels=3):

  image_buffer = tf.read_file(file_name)
  normalized = imagenet_preprocessing.preprocess_image(
      image_buffer=image_buffer,
      bbox=None,
      output_height=output_height,
      output_width=output_width,
      num_channels=num_channels,
      is_training=False)
  
  with tf.Session() as sess:
    result = sess.run([normalized])

  return result[0]

image = preprocess_image(INPUT_IMAGE_PATH)
image = np.expand_dims(image, axis=0)
print(image.shape)

graph = tf.Graph()
with graph.as_default():

    graph_def = tf.GraphDef()

    with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as f:
        serialized_graph = f.read()
        graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(graph_def, name='')

with graph.as_default():

    config = tf.ConfigProto()
    config.gpu_options.per_process_gpu_memory_fraction = GPU_MEM_FRACTION # 0.5
    with tf.Session(config=config) as sess:

        input_image = graph.get_tensor_by_name('input_tensor:0')
        softmax_predictions = graph.get_tensor_by_name('softmax_tensor:0')

        # warmup
        for i in range(5):
            start = time.time()
            predictions = sess.run(softmax_predictions,
                                   feed_dict={input_image: image})
            end = time.time()
            print(end - start, " seconds")

            idx = np.argmax(predictions[0])

            print(predictions[0][idx])

        time_buffer = []
        for i in range(100):

            start = time.time()
            predictions = sess.run(softmax_predictions,
                                   feed_dict={input_image: image})
            end = time.time()
            print(end - start, " seconds")

            time_buffer.append(end - start)

            idx = np.argmax(predictions[0])

            print(predictions[0][idx])

        print("Average: ", np.mean(np.array(time_buffer)))
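
The warm-up and timing loops above could be folded into one helper that reports the slow first call separately from the steady-state average. This is only a minimal sketch: `benchmark` and `run_once` are hypothetical names that are not part of the original script, and it uses only the standard library.

```python
import time


def benchmark(run_once, warmup=5, iters=100):
    """Time run_once(), keeping the first call apart from steady state."""
    # First call: this is where the multi-second delay described above shows up.
    start = time.perf_counter()
    run_once()
    first_call = time.perf_counter() - start

    # Remaining warm-up calls are executed but not recorded.
    for _ in range(max(warmup - 1, 0)):
        run_once()

    # Steady-state timings, one entry per iteration.
    timings = []
    for _ in range(iters):
        start = time.perf_counter()
        run_once()
        timings.append(time.perf_counter() - start)

    mean_time = sum(timings) / len(timings)
    return first_call, mean_time, timings
```

With the session from the snippet, `run_once` could be `lambda: sess.run(softmax_predictions, feed_dict={input_image: image})`.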

Let me know your thoughts.
Thanks,
Arun.

AastaLLL · July 16, 2019, 3:16am

Hi,

TensorFlow doesn’t apply Jetson-specific optimizations, so it won’t give you the best performance.

It’s recommended to use pure TensorRT rather than TF-TRT on a Jetson system.
Here is a tutorial and some benchmark results for your reference:
https://github.com/NVIDIA-AI-IOT/tf_to_trt_image_classification
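
On the memory question, one thing that may be worth trying while still on TF-TRT: by default TensorFlow 1.x reserves most of the GPU memory when a session is created, and on the Nano the GPU shares the 4 GB with the CPU. The following is a minimal sketch, assuming TensorFlow 1.x as in the post above; `make_session_config` is a hypothetical helper, not part of TensorFlow or the original script.

```python
def make_session_config(gpu_mem_fraction=None):
    """Build a TF 1.x session config that allocates GPU memory on demand."""
    # Imported lazily so the helper can be defined before TensorFlow is loaded.
    import tensorflow as tf  # assumption: TensorFlow 1.x

    config = tf.ConfigProto()
    # Grow the GPU allocation as needed instead of reserving it up front.
    config.gpu_options.allow_growth = True
    if gpu_mem_fraction is not None:
        # Optional upper bound, as a fraction of total GPU memory.
        config.gpu_options.per_process_gpu_memory_fraction = gpu_mem_fraction
    return config
```

A session would then be created with `tf.Session(config=make_session_config(0.5))`. The same config would likely need to be passed to every `tf.Session` in the process, including the one inside `preprocess_image`, since the GPU memory options appear to be fixed by the first session that initializes the GPU.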

Thanks.
