How can I reduce latency on Jetson TX2?

Hi,

We are developing an application using PeopleNet and TrafficNet on a TX2.
From our studies of DeepStream, we expect the end-to-end delay to be around 70 ms.
When I measured the latency in DeepStream, I got the following results for one frame with PeopleNet:

Comp name = nvv4l2decoder0 in_system_timestamp = 1593689758799.982910 out_system_timestamp = 1593689758998.961914 component latency= 198.979004
Comp name = src_bin_muxer source_id = 0 pad_index = 0 frame_num = 1442 in_system_timestamp = 1593689758999.022949 out_system_timestamp = 1593689759070.697021 component_latency = 71.674072
Comp name = primary_gie in_system_timestamp = 1593689759070.767090 out_system_timestamp = 1593689759137.716064 component latency= 66.948975
Comp name = tracking_tracker in_system_timestamp = 1593689759137.745117 out_system_timestamp = 1593689759147.971924 component latency= 10.226807
Comp name = tiled_display_tiler in_system_timestamp = 1593689759148.038086 out_system_timestamp = 1593689759152.321045 component latency= 4.282959
Comp name = nvosd0 in_system_timestamp = 1593689759153.775879 out_system_timestamp = 1593689759153.791016 component latency= 0.015137
Source id = 0 Frame_num = 1442 Frame latency = 369.830078 (ms)
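
For reference, the per-component numbers above come from DeepStream's built-in latency measurement, which I believe is enabled through environment variables before launching deepstream-app (the config file name below is just a placeholder):

$ export NVDS_ENABLE_LATENCY_MEASUREMENT=1
$ export NVDS_ENABLE_COMPONENT_LATENCY_MEASUREMENT=1
$ deepstream-app -c peoplenet_app_config.txt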

1- Is it possible to reduce the delay?
2- If so, for which components can I reduce it?


Hi,

Have you maximized the device performance first?

$ sudo nvpmodel -m 0
$ sudo jetson_clocks
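
On recent JetPack releases you should also be able to verify the applied clock settings with:

$ sudo jetson_clocks --show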

Thanks.


Hi,

Thanks for the answer.

This reduced the latency by about 10-12 ms.

The TLT models (PeopleNet, etc.) are faster than the other examples. Why is this?

Are there any changes I can make to the components to further reduce latency? Or are there parameters in the config files that affect the delay?

Thanks again for your interest.

Hi,

The latency is related to the inference time.
You can check our TLT documentation for more information.

There is a ‘pruning’ step that removes unnecessary layers as much as possible.
As a result, the inference time is improved, and so is the latency.
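
As a rough sketch, pruning a trained model looks something like the command below; the exact flags depend on the TLT version and network, and the model file names and key are placeholders, so please follow the TLT documentation for your release:

$ tlt-prune -m resnet34_peoplenet.tlt \
            -o resnet34_peoplenet_pruned.tlt \
            -pth 0.5 \
            -k $YOUR_NGC_KEY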

Our suggestion is to check whether the latency bottleneck really comes from the inference part.
You can check this by running the model with trtexec directly.

$ /usr/src/tensorrt/bin/trtexec --deploy=...
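
For example, if you have already built a TensorRT engine for PeopleNet (e.g. with tlt-converter), something like the following should report the pure inference time; the engine file name here is only a placeholder:

$ /usr/src/tensorrt/bin/trtexec --loadEngine=resnet34_peoplenet.engine --iterations=100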

Thanks.


Is it a tool to measure the delay, or is it something you wrote yourself? I want to use it. Thanks.

Hi,

TLT is our Transfer Learning Toolkit.
It retrains the model and applies the acceleration (pruning) at the same time.
That's why the TLT models can reach a smaller delay than the standard versions.

Thanks.
