Optimize inference time of YOLOv2 model on Jetson Xavier NX

Hello there,

I’m currently working on a school project where we’re using machine learning methods to develop autonomous driving toy cars. These are equipped with lots of hardware and sensors that communicate through ROS packages. My task within the team is to optimize the inference time of an already trained (tiny-)YOLO TensorFlow model, which is used for object detection on our newly purchased hardware (NVIDIA Jetson Xavier NX).

The model already runs with an inference time of about 30 ms on x86 hardware (without TensorRT). However, on our new ARM-based hardware the inference time is about 100 ms, which is too slow; we expect it can be a lot better. The object detection runs inside a Docker container (using the nvcr.io/nvidia/l4t-tensorflow:r32.7.1-tf2.7-py3 image from NGC) as a ROS package. I already made sure that the Docker container uses the NVIDIA container runtime and therefore has access to CUDA.
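For context, this is roughly how we measure the latency numbers above (a minimal sketch; the `lambda` stand-in would be replaced by the actual model call, e.g. `lambda: model(image)`):

```python
import time
import statistics

def measure_latency_ms(infer, n_warmup=5, n_runs=50):
    """Time a single-sample inference callable and return mean latency in ms.

    Warm-up runs are excluded so one-time costs (CUDA context creation,
    kernel autotuning) don't distort the average.
    """
    for _ in range(n_warmup):
        infer()
    samples = []
    for _ in range(n_runs):
        start = time.perf_counter()
        infer()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples)

# Stand-in workload for illustration; replace with the real inference call.
mean_ms = measure_latency_ms(lambda: sum(range(10000)))
print(f"mean latency: {mean_ms:.2f} ms")
```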
So my next step was to make use of TensorRT to optimize the inference time. The only problem was that the TensorFlow version in the Docker image was built without TensorRT support. Currently I’m trying to do the conversion to a TensorRT model outside the Docker container.
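One route I’m looking at to sidestep the TensorFlow/TensorRT build problem entirely is to export the model to ONNX with tf2onnx and build an engine with trtexec, which ships with JetPack. This is a sketch under the assumption that the model is available as a TF2 SavedModel; all file names here are placeholders:

```shell
# Export the TF2 SavedModel to ONNX (tf2onnx is a pip package).
# "tiny_yolo_saved_model" and the output names are placeholders.
python3 -m tf2onnx.convert \
    --saved-model tiny_yolo_saved_model \
    --output tiny_yolo.onnx

# Build a TensorRT engine with trtexec (installed with JetPack under
# /usr/src/tensorrt/bin). FP16 is usually a large speedup on Xavier NX.
/usr/src/tensorrt/bin/trtexec \
    --onnx=tiny_yolo.onnx \
    --saveEngine=tiny_yolo_fp16.engine \
    --fp16
```

trtexec also prints latency statistics after building, which gives a quick upper bound on what the engine can do before wiring it into the ROS node.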

I was just wondering if I’m missing something, or if you have any ideas/resources that could help me achieve my goal.

Output of $jtop:
NVIDIA Jetson Xavier NX (Developer Kit Version) - Jetpack UNKNOWN [L4T 32.7.2]

  • Type: Xavier NX (Developer Kit Version)
  • SOC Family: tegra194 ID: 25
  • Module: P3668 Board: P3509-000
  • Code Name: jakku
  • Cuda ARCH: 7.2
  • Serial Number: 1423421015790
  • Libraries:
    • CUDA: 10.2.300
    • OpenCV: 4.1.1 compiled CUDA: NO
    • TensorRT: 8.2.1.8
    • VPI: ii libnvvpi1 1.2.3 arm64 NVIDIA Vision Programming Interface library
    • VisionWorks: 1.6.0.501
    • Vulkan: 1.2.70
    • cuDNN: 8.2.1.32
  • Hostname: xavier-nx
  • Interfaces:
    • wlan0: 192.168.178.155
    • docker0: 172.17.0.1
    • br-6999ad3e: 172.19.0.1
    • br-d7446ff7: 172.18.0.1

Thanks already
Alex

Hi,

This looks like a Jetson issue. Please refer to the samples below in case they are useful.

For any further assistance, we will move this post to the Jetson-related forum.

Thanks!

