nvoverlaysink latency is high


On a TX2 4GB running R32.2 on our custom-designed board, when we capture live from a camera and display the video with nvoverlaysink, the glass-to-glass delay is much larger than with xvimagesink.

the test command for nvoverlaysink:
gst-launch-1.0 mycamerasrc device=/dev/video0 ! 'video/x-raw,width=1920,height=1080,format=YUY2' ! nvvidconv ! 'video/x-raw(memory:NVMM),width=1920,height=1080,format=NV12' ! nvoverlaysink sync=false display-id=1
glass-to-glass delay = 460ms;
playback is smooth;

the test command for xvimagesink:
gst-launch-1.0 mycamerasrc device=/dev/video0 ! 'video/x-raw,width=1920,height=1080,format=YUY2' ! xvimagesink sync=false
glass-to-glass delay = 248ms;
playback is a bit sluggish;

We prefer nvoverlaysink because its playback is smooth. How can we reduce its latency to about 200 ms, as with xvimagesink, or even below 150 ms, as on the devkit board?

E.g., on the devkit board, with the following command:
gst-launch-1.0 nvarguscamerasrc ! 'video/x-raw(memory:NVMM),width=1920,height=1080,format=NV12,framerate=30/1' ! nvoverlaysink sync=false
glass-to-glass latency is 121 ms;
playback is smooth.

P.S. All test cases were executed after running jetson_clocks.sh, with all 6 cores at 2 GHz.
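To put the three measurements on a common scale, the short sketch below converts each reported latency into frame periods. It assumes 30 fps for every pipeline; only the devkit command states framerate=30/1, so the figure for the custom camera is an assumption, and the helper name `latency_in_frames` is illustrative.

```python
# Glass-to-glass latencies reported in this thread, in milliseconds.
MEASURED_MS = {
    "custom board, mycamerasrc + nvoverlaysink": 460,
    "custom board, mycamerasrc + xvimagesink": 248,
    "devkit, nvarguscamerasrc + nvoverlaysink": 121,
}

# Assumption: 30 fps everywhere. Only the devkit pipeline states framerate=30/1.
ASSUMED_FPS = 30


def latency_in_frames(latency_ms, fps=ASSUMED_FPS):
    """Convert a latency in milliseconds to a number of frame periods."""
    return latency_ms * fps / 1000.0


if __name__ == "__main__":
    for name, ms in MEASURED_MS.items():
        print(f"{name}: {ms} ms is about {latency_in_frames(ms):.1f} frame periods")
```

Under that assumption, 460 ms corresponds to roughly 14 frame periods, against roughly 7 for xvimagesink and fewer than 4 on the devkit.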

Please execute sudo jetson_clocks and check again. There is a memcpy that copies the CPU buffer to an NVMM buffer in nvvidconv:

'video/x-raw,width=1920,height=1080,format=YUY2' ! nvvidconv ! 'video/x-raw(memory:NVMM),width=1920,height=1080,format=NV12'
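As a rough sense of how much data that copy moves, the sketch below computes the size of one 1920x1080 frame in each format from the caps above. It assumes tightly packed frames (no stride padding) and a 30 fps source; the camera's frame rate is not stated in the thread, and the function names are illustrative.

```python
WIDTH, HEIGHT = 1920, 1080

# Average bytes per pixel: YUY2 is packed 4:2:2, NV12 is 4:2:0.
BYTES_PER_PIXEL = {"YUY2": 2.0, "NV12": 1.5}

# Assumption: the custom camera's frame rate is not stated in the thread.
ASSUMED_FPS = 30


def frame_bytes(fmt, width=WIDTH, height=HEIGHT):
    """Size of one tightly packed frame in bytes, ignoring stride padding."""
    return int(width * height * BYTES_PER_PIXEL[fmt])


def copy_rate_mb_per_s(fmt, fps=ASSUMED_FPS):
    """Data volume per second for this format, in MB (10**6 bytes)."""
    return frame_bytes(fmt) * fps / 1e6


if __name__ == "__main__":
    for fmt in BYTES_PER_PIXEL:
        print(f"{fmt}: {frame_bytes(fmt)} bytes/frame, "
              f"{copy_rate_mb_per_s(fmt):.1f} MB/s at {ASSUMED_FPS} fps")
```

A YUY2 frame at this resolution is 4,147,200 bytes, so the CPU-side copy handles about 124 MB/s at the assumed 30 fps.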

Running the CPUs at max clocks should bring improvement. You may also try another power mode, such as sudo nvpmodel -m 0. Please refer to the developer guide.

Yes, we have tried MaxP and MaxN with no improvement.

Looks like it is a v4l2 source. You may check whether you get lower latency running the jetson_multimedia_api sample 12_camera_v4l2_cuda. The sample eliminates the memcpy and is an optimal solution for v4l2 sources.

It was a camerasrc issue. After we replaced it with an appsrc, the latency dropped to 180 ms.
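The appsrc code was not posted, so the following is only a minimal sketch of what such a pipeline could look like, not the poster's implementation. It assumes the application already obtains raw YUY2 frames by some other means, assumes 30 fps, and reuses display-id=1 from the original test command. The names `build_pipeline_description`, `start_pipeline`, and `push_frame` are hypothetical; the GStreamer calls need PyGObject and the Jetson plugins.

```python
WIDTH, HEIGHT = 1920, 1080
ASSUMED_FPS = 30  # assumption: frame rate is not stated for the custom camera


def build_pipeline_description(display_id=1):
    """Return a parse-launch style description: appsrc -> nvoverlaysink."""
    src_caps = (
        f"video/x-raw,format=YUY2,width={WIDTH},height={HEIGHT},"
        f"framerate={ASSUMED_FPS}/1"
    )
    nvmm_caps = (
        f"video/x-raw(memory:NVMM),format=NV12,width={WIDTH},height={HEIGHT}"
    )
    return (
        f'appsrc name=src is-live=true do-timestamp=true format=time '
        f'caps="{src_caps}" ! nvvidconv ! {nvmm_caps} ! '
        f'nvoverlaysink sync=false display-id={display_id}'
    )


def start_pipeline():
    """Create and start the pipeline; run this on the Jetson only."""
    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst

    Gst.init(None)
    pipeline = Gst.parse_launch(build_pipeline_description())
    appsrc = pipeline.get_by_name("src")
    pipeline.set_state(Gst.State.PLAYING)
    return pipeline, appsrc


def push_frame(appsrc, yuy2_bytes):
    """Push one raw YUY2 frame (WIDTH * HEIGHT * 2 bytes) into appsrc."""
    from gi.repository import Gst

    buf = Gst.Buffer.new_wrapped(yuy2_bytes)
    return appsrc.emit("push-buffer", buf)


if __name__ == "__main__":
    # Printing the description is safe on any machine.
    print(build_pipeline_description())
```

Setting is-live=true and do-timestamp=true lets appsrc stamp each buffer on arrival, and sync=false on the sink matches the commands tested above.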