How many 1080p HD video streams can the Tegra TK1 encode simultaneously at 30 FPS?

Hello,

I am trying to use GStreamer 1.0 and the omxh264enc plugin to encode 1920x1080 video streams on the Tegra TK1. When I run two instances of GStreamer to encode each stream independently, I get ~28 FPS for a single stream and ~26 FPS per stream for two. But if I merge the two HD streams into one by stacking the frames on top of each other into a 1920x2160 frame, encoding slows down to ~15 FPS. So my questions are:

  1. Does the Tegra TK1 H.264 encoder have an upper limit on resolution, above which encoding no longer runs at ~30 FPS?

  2. Why can’t I get exactly 30 FPS with just one video stream? (I maxed out CPU and GPU performance.) The Tegra docs claim that I should be able to encode a 1080p video stream at 30 FPS.

  3. What is the best way (on the Tegra TK1) to do the following: get two HD streams from two USB 3.0 cameras, encode them to H.264 in a synchronized fashion, and stream them out? It is important for me to have synchronized frames from the cameras. If I use several instances of GStreamer, it is hard to sync the frames on the receiving side.

Thanks!

Try using gstreamer0.10 to see if you get better performance. I did for my application. Not sure why yet.

The gstreamer manual has some info that might help you with synchronizing.

You can use something like this to time how long it takes to encode 1000 frames:

time gst-launch-1.0 -v videotestsrc num-buffers=1000 ! "video/x-raw,format=(string)I420,width=1920,height=1080" ! omxh264enc ! avimux ! filesink location=/tmp/test-1.avi

Run that alone and then two at the same time and compare the results.
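One way to run the two-at-once case is to background both pipelines in a subshell and time the whole thing (the second output filename is just an example):

time ( gst-launch-1.0 videotestsrc num-buffers=1000 ! "video/x-raw,format=(string)I420,width=1920,height=1080" ! omxh264enc ! avimux ! filesink location=/tmp/test-1.avi & gst-launch-1.0 videotestsrc num-buffers=1000 ! "video/x-raw,format=(string)I420,width=1920,height=1080" ! omxh264enc ! avimux ! filesink location=/tmp/test-2.avi & wait )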

I’m seeing 60 fps for one stream and 57 fps if I run two at the same time.

Without any encoding, I’m seeing only 77 fps:

time gst-launch-1.0 -v videotestsrc num-buffers=1000 ! "video/x-raw,format=(string)I420,width=1920,height=1080" ! fakesink

One problem with USB cameras is that they don’t output the optimal format for the video encoder, so there’s an intermediate conversion step involved.
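As a rough sketch (assuming a UVC camera on /dev/video0 that delivers YUY2; the device path and caps are guesses, not something from your setup), that extra step is a videoconvert in front of the encoder:

gst-launch-1.0 v4l2src device=/dev/video0 ! "video/x-raw,format=(string)YUY2,width=1920,height=1080" ! videoconvert ! "video/x-raw,format=(string)I420" ! omxh264enc ! avimux ! filesink location=/tmp/cam.avi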

How would you synchronise two webcams on a PC? I’m not sure there even is a way, but I’m not an expert on that. I would be interested to know as well.

Thanks for the suggestion!

I tried it, and yes, I am getting 62 FPS when encoding one 1920x1080 video stream with the default videotestsrc pattern (SMPTE bars with a small snow region) to H.264. I get 34 FPS when I encode a 2048x2048 video stream (same pattern).

BUT! If I switch to “videotestsrc pattern=snow” (random white noise), the frame rates drop to 27 FPS for 1080p and to 13 FPS for the 2048x2048 stream.

In a real app (reading frames from two 1080p cameras and combining them into a single 1920x2160 frame to avoid synchronization issues), I am getting only 11 FPS on the client side.

Interestingly, if I stream the two 1080p streams in parallel (two GStreamer pipelines), I get about 27 FPS for each stream. But then I have to synchronize them somehow (which is a problem).

Most Tegra K1 docs/diagrams claim that it supports H.264 encoding at 2160p30, i.e. 4K UHD (3840x2160) frames at 30 FPS. Since 1920x2160 is only half that many pixels, I should be able to encode it at 30 FPS, but I am not getting that.

Any suggestion would be greatly appreciated! Thanks

What GStreamer command line are you using for your test? And which USB 3.0 camera are you using?

Combining the videos causes extra reads and writes of the frames, and that’s heavy.
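To put rough numbers on it: one 1920x2160 I420 frame is 1920 x 2160 x 1.5 bytes ≈ 6.2 MB, so at 30 fps the compositing alone adds on the order of 190 MB/s of extra reads plus 190 MB/s of extra writes to memory.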

If you combine the videos after you’ve got them from the cameras, they are already out of sync, at least to a degree. For proper synchronisation you would need some sort of clock signal that tells both cameras the exact time to start each frame (to be honest, I don’t know if something like that actually exists).

But if you are OK with combining the frames after you get them from the cameras, then you should be able to maintain that sync even without combining them. All buffers have precise timestamps, and you can get those timestamps into the RTP headers when streaming, so the other end can use them to sync the streams.
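For example, a minimal per-camera sender could look something like this (device paths, host and ports are placeholders; run one pipeline per camera on its own port):

gst-launch-1.0 v4l2src device=/dev/video0 ! videoconvert ! omxh264enc ! h264parse ! rtph264pay ! udpsink host=192.168.0.10 port=5000
gst-launch-1.0 v4l2src device=/dev/video1 ! videoconvert ! omxh264enc ! h264parse ! rtph264pay ! udpsink host=192.168.0.10 port=5002

On the receiving side an rtpbin can then line the streams up from those timestamps (for proper cross-stream sync you would normally also carry RTCP, which rtpbin handles).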

For simple encoding, yeah, it might meet what they claim.
But this case is more complicated: it involves video mixing.
The last time I checked, NV still hadn’t made video mixing GPU-accelerated on gst-1.0.
Whether or not your app implements the mixing with GPU acceleration, either case raises doubts.

A. Implemented with GPU acceleration
Then the issue might be how much resource the video mixing is using.

B. Not implemented with GPU acceleration
Then good luck… doing video mixing in pure software is just a tragedy.
Back when I first got my hands on the Jetson TK1,
I came up with a gst pipeline that mixed four 720p30fps mp4 films and played them on screen.
The result was only about 8-12 fps.
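For reference, a pure-software stack of two 1080p inputs into a single 1920x2160 frame with videomixer looks roughly like this (an untested sketch, with test sources standing in for the cameras):

gst-launch-1.0 videomixer name=mix background=black sink_1::ypos=1080 ! "video/x-raw,format=(string)I420,width=1920,height=2160" ! omxh264enc ! avimux ! filesink location=/tmp/stacked.avi videotestsrc num-buffers=1000 ! "video/x-raw,format=(string)I420,width=1920,height=1080" ! mix.sink_0 videotestsrc num-buffers=1000 ! "video/x-raw,format=(string)I420,width=1920,height=1080" ! mix.sink_1

Every pixel of both inputs goes through videomixer on the CPU before the encoder ever sees it, which is where the time goes.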

Just one man’s experience and opinion.

This is going slightly off-topic, but I’ve implemented a small player that shows 4x 1080p streams on the display simultaneously (or 9x 480p streams):
https://github.com/kulve/gst-multiwindow

It looks like the slowdown is actually caused by videotestsrc itself. It consumes a lot of CPU doing the snow pattern. You can time that by dropping the encoder:

time gst-launch-1.0 -v videotestsrc num-buffers=1000 ! "video/x-raw,format=(string)I420,width=1920,height=1080" ! fakesink

vs.

time gst-launch-1.0 -v videotestsrc pattern=snow num-buffers=1000 ! "video/x-raw,format=(string)I420,width=1920,height=1080" ! fakesink