The most efficient camera recording scheme

plain · August 11, 2017, 3:24am

Hi all,
I want to record video from 8 cameras at the same time, each at 720p and 30 fps.
My current data flow is:
i) get YUYV frames from the camera via V4L2
ii) convert YUYV to YUV420 via the L4T Multimedia API NvVideoConverter (is this implemented on the GPU?)
iii) encode YUV420 to H.265 via the L4T Multimedia API NvVideoEncoder
iv) in the callback, write the encoded data to the record file
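
For a sense of scale, the raw data rate through steps i) and ii) can be estimated with a little arithmetic. This is only a rough sketch: it assumes 720p means 1280x720, 2 bytes per pixel for packed YUYV and 1.5 bytes per pixel for planar YUV420, and the function names are made up for illustration.

```python
# Rough estimate of raw (unencoded) data rates for the capture pipeline.
# Assumptions (not stated in the post): 720p means 1280x720,
# YUYV is packed 4:2:2 (2 bytes/pixel), YUV420 is 1.5 bytes/pixel.

def yuyv_frame_bytes(width, height):
    """Bytes in one packed YUYV (4:2:2) frame."""
    return width * height * 2

def yuv420_frame_bytes(width, height):
    """Bytes in one YUV420 frame (full Y plane plus quarter-size U and V)."""
    return width * height * 3 // 2

def total_rate_mb_per_s(frame_bytes, fps, cameras):
    """Aggregate data rate in MB/s, with 1 MB = 10**6 bytes."""
    return frame_bytes * fps * cameras / 1e6

if __name__ == "__main__":
    w, h, fps, cams = 1280, 720, 30, 8
    print(total_rate_mb_per_s(yuyv_frame_bytes(w, h), fps, cams))    # 442.368
    print(total_rate_mb_per_s(yuv420_frame_bytes(w, h), fps, cams))  # 331.776
```

Under these assumptions, every extra CPU-side copy of the frames moves several hundred MB/s across 8 cameras.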

This works on the TX1 with fewer than 4 cameras: the frame rate is stable at 30 fps and the encode time is about 10 ms per frame
(measured as the time for dqBuffer + filling YUV data into the NvBuffer + qBuffer).
But when the number of cameras increases to 6, the encode time becomes unstable and is sometimes as long as 40 ms
(measured the same way).
It is even worse with 8 cameras working at the same time.
As a result, the frame rate cannot reach 30 fps.
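
To make the problem concrete: at 30 fps a camera delivers a new frame every 1000/30 ≈ 33.3 ms, so a 40 ms encode step overruns the frame interval while a 10 ms one does not. A minimal sketch of that check, assuming each camera is serviced by its own thread so the full interval is available per camera (the function names and the sample timing list are illustrative, built from the 10 ms and 40 ms figures above):

```python
def frame_interval_ms(fps):
    """Time available per frame, in milliseconds, at the given frame rate."""
    return 1000.0 / fps

def overrun_indices(encode_times_ms, fps):
    """Indices of frames whose encode time exceeds the frame interval."""
    budget = frame_interval_ms(fps)
    return [i for i, t in enumerate(encode_times_ms) if t > budget]

if __name__ == "__main__":
    samples_ms = [10, 10, 40, 10]  # illustrative timings, not a real log
    print(round(frame_interval_ms(30), 1))   # 33.3
    print(overrun_indices(samples_ms, 30))   # [2]
```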

So I wonder if there is a better scheme for video recording,
for example one that reduces the number of memcpy operations between the ARM CPU and the audio/video processor or the GPU.

DaneLLL · August 14, 2017, 1:37am

Hi plain, are you on r24.2.1? Or r28.1?

plain · August 14, 2017, 2:16am

Hi DaneLLL,

# head -n 1 /etc/nv_tegra_release

I get R24.2.1.

DaneLLL · August 14, 2017, 7:35am

For r24.2.1, please refer to
https://devtalk.nvidia.com/default/topic/994281/jetson-tx1/v4l2-video-encoder-performance-/post/5090266/#5090266

plain · August 16, 2017, 8:57am

Thanks a lot. It works better now in ultra fast mode.
But there are still occasional outliers where encoding takes about 30-40 ms per frame.

I wonder if there is a better way to record the videos,
for example by reducing the number of memcpy operations between the ARM CPU and the audio/video processor or the GPU.
Total CPU usage is now 260% (100% + 100% + 60%, about 33% per camera) and the CPU load average is about 6.5 (the TX1 has 4 cores, each running at 1.734 GHz).
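
These figures can be put side by side with a bit of arithmetic. The sketch below assumes all 8 cameras are running (which is what about 33% per camera implies) and uses made-up helper names.

```python
def per_camera_cpu_percent(total_percent, cameras):
    """Average CPU usage attributable to one camera, in percent of one core."""
    return total_percent / cameras

def cpu_utilization(total_percent, cores):
    """Fraction of total CPU capacity in use (each core counts as 100%)."""
    return total_percent / (cores * 100.0)

def load_per_core(load_average, cores):
    """Load average normalized by the number of cores."""
    return load_average / cores

if __name__ == "__main__":
    print(per_camera_cpu_percent(260, 8))  # 32.5
    print(cpu_utilization(260, 4))         # 0.65
    print(load_per_core(6.5, 4))           # 1.625
```

A normalized load above 1 per core suggests that threads are queuing for CPU time even though measured usage is about 65% of capacity.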

Any advice would be appreciated. Thank you.