Accelerated Video Playback on Jetson TK1

We’re currently evaluating the video decoding capabilities of the TK1 (using L4T R19.3), and are not too impressed so far. The only way we have found to get decent performance with acceptable CPU load is:

gst-launch-0.10 filesrc location="$1" ! qtdemux name=demux demux.video_00 ! \
  queue ! nv_omx_h264dec ! nv_omx_hdmi_videosink

This lets us decode a single 4k video stream (the “Roast Duck” demo by Red) and output it over HDMI (no xorg involved). Trying to play the same video in an X window using

nvgstplayer -i <filename>

gives ~1 fps (we tried the 0.10 and 1.0 versions, under the Ubuntu desktop with compiz and under lxde without any compositor; all the same). Is 4k playback under xorg supported? And if so, how can we get it going?

Another problem that’s troubling us is playing back multiple 1080p video streams simultaneously. Playback of up to 4 streams under xorg is kind-of acceptable, with all 4 cores maxed out and occasional drops.

One goal is to play 4-6 streams concurrently in a grid, and we’ve tried several gstreamer pipelines to test this. One route might be something along the lines of

gst-launch \
   nv_omx_videomixer name=mix ! nv_omx_hdmi_videosink \
   filesrc location="big_buck_bunny_1080p_h2642.mov" ! queue ! \
     qtdemux name=demux1 demux1.video_00 ! nv_omx_h264dec ! mix. \
   filesrc location="big_buck_bunny_1080p_h2642.mov" ! queue ! \
     qtdemux name=demux2 demux2.video_00 ! nv_omx_h264dec ! mix.

However, we’re unable to get nv_omx_h264dec to talk to nv_omx_videomixer (“could not link” error). Throwing in an nvvidconv does not seem to help. And even if it did, it’s still puzzling how to specify the video offsets / sizes with nv_omx_videomixer. The port parameters from the original gstreamer videomixer (“xpos”, “ypos”, etc.) seem to be unsupported by nv_omx_videomixer.

Simply replacing the nv_omx_videomixer with said standard gstreamer component actually does get something on the screen, but again with horrible performance (which is not entirely surprising, because it’s processing lots of pixels using the CPU). Also, we were unable to find a configuration which gives correct colors. It seems that at some place in the pipeline YUV is interpreted as RGB(A), but we did not bother to figure out where because the performance is not acceptable anyway.

So, is there any chance to play several videos in a grid-like arrangement?

Kodi (XBMC) is able to play 4k, so it is proven to work. But Kodi uses OpenMAX directly, and officially only GStreamer is supported.

Did you try using playbin2 with gst-launch? I haven’t really checked the performance, but at least it did play 1080p in an X.Org window fluently for me.

For fluent 4k playback in Kodi, both the CPU and GPU clocks had to be manually bumped up:
http://elinux.org/Jetson/Performance

If the videomixer plugin isn’t efficient enough, maybe you need to implement an application (in C or maybe in python) that creates 4 X11 windows and uses the GstXOverlay interface for setting the streams to those windows? I think that should be an efficient way of doing it. Kodi uses OpenGL for scaling and rendering and shaders for color space conversion but that approach may need quite a lot of implementation work if done from scratch.
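For the multi-window route, the geometry of each stream's window could be computed up front. The following is a minimal Python sketch of that step only and has not been tested on the TK1: `Rect` and `grid_layout` are hypothetical names, and the 1920x1080 screen size is an assumption. Each rectangle would then be applied to one X11 window, whose id is handed to the video sink through the overlay interface.

```python
import math
from typing import List, NamedTuple


class Rect(NamedTuple):
    """Position and size of one playback window, in screen pixels."""
    x: int
    y: int
    width: int
    height: int


def grid_layout(n_streams: int, screen_w: int, screen_h: int) -> List[Rect]:
    """Return one window rectangle per stream, filling the screen row by row."""
    if n_streams < 1:
        raise ValueError("need at least one stream")
    # Use the smallest column count whose square grid holds all streams,
    # then only as many rows as are actually needed.
    cols = math.ceil(math.sqrt(n_streams))
    rows = math.ceil(n_streams / cols)
    cell_w = screen_w // cols
    cell_h = screen_h // rows
    return [
        Rect((i % cols) * cell_w, (i // cols) * cell_h, cell_w, cell_h)
        for i in range(n_streams)
    ]


if __name__ == "__main__":
    # Assumed example: six streams on a 1920x1080 screen.
    for index, rect in enumerate(grid_layout(6, 1920, 1080)):
        print(index, rect)
```

With six streams this yields a 3x2 grid of 640x540 cells. Note that the cells keep the 16:9 aspect ratio only for square grids, so the sink may need to letterbox in the other cases.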

We stumbled upon the “nvgstplayer-1.0_README.txt” file yesterday, and doing an

sudo apt-get install gstreamer1.0-tools gstreamer1.0-alsa \
  gstreamer1.0-plugins-base gstreamer1.0-plugins-good \
  gstreamer1.0-plugins-ugly gstreamer1.0-plugins-bad gstreamer1.0-libav

as suggested there actually allows the nvgstplayer-1.0 application to play videos (including 4k material) under X with acceptable performance. It’s still unclear exactly which library was missing before, but that’s not terribly interesting at this point. It may be worth noting, though, that following the similar instructions for nvgstplayer-0.10 does not seem to have any impact on performance under X. But it’s sufficient for us if gstreamer 1.0 works, so…

I assume that at this point gstreamer 1.0 is the preferred way to make use of the video acceleration capabilities, and 0.10 is considered deprecated and should not be used for new developments. Can anyone confirm this?

I actually have exactly the program you’re describing lying around from another project, where we’re only decoding SD streams on older Nvidia hardware. Given the findings outlined above, it seems feasible to adapt it to the Tegra platform, which seems to be capable of decoding a sufficient number of HD streams, and call it done.

We have some experience with (and code using) EGL / GLES / OpenMAX on other platforms and had some hopes to make use of it on Tegra, thus getting rid of X altogether. The existence of the “cube_texture_and_coords” example in “gstomx1_src.tbz2” seems to indicate that it’s possible, but otherwise there is not much documentation about this and we’re not sure if it’s a good idea to go this route.

I’m not sure which is recommended: 0.10 or 1.0. I’ve used 0.10 previously, but I’m currently learning the new stuff in 1.0 to try it out for a simple streaming case.

If you have something for older Tegras, it might work by simply recompiling. I think the same APIs are still supported.

I think EGL/GLES works only with X.Org, so that might not be the correct approach if you don’t want to use X.Org.

Hi waldheinz,
You can use the following pipeline to play multiple 1080p videos in windowed mode:

gst-launch-0.10 filesrc location="$1" ! qtdemux name=demux demux.video_00 ! \
  queue ! nv_omx_h264dec ! queue ! nveglglessink
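To play several files at once, the command line above could be generated once per file. The following Python sketch only builds and prints the commands, so it runs on any machine. `playback_argv` and `shell_line` are hypothetical helper names, and the second file name in the example is a placeholder. The element names are the ones from the pipeline above.

```python
import shlex
from typing import List


def playback_argv(path: str, sink: str = "nveglglessink") -> List[str]:
    """Build the gst-launch-0.10 argument list for one H.264 .mov/.mp4 file."""
    return [
        "gst-launch-0.10",
        "filesrc", f"location={path}", "!",
        "qtdemux", "name=demux", "demux.video_00", "!",
        "queue", "!", "nv_omx_h264dec", "!",
        "queue", "!", sink,
    ]


def shell_line(argv: List[str]) -> str:
    """Render an argument list as a single shell-safe command line."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    # Placeholder file names; replace with real 1080p H.264 clips.
    for clip in ["big_buck_bunny_1080p_h2642.mov", "second_clip.mov"]:
        print(shell_line(playback_argv(clip)))
```

On the Jetson, each argument list could be passed to `subprocess.Popen` instead of being printed, which would start one decoder pipeline and one window per file.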

nvgstplayer-1.0 will support window position specification by user from the next release forward.

Meanwhile you can use nvgstplayer with the above sink and resize and move playback windows:

nvgstplayer -i <filename> --svs="nveglglessink"   # you can also specify nv_omx_hdmi_videosink here

Btw, how are you measuring fps?
I would suggest using nvgstplayer with --stats to measure it.

Let me know if this helps and feel free to ask any questions if you have them.
Regards,
Rahool

Thanks Rahool, the nveglglessink really makes a huge difference compared to the default xvimagesink when looking at CPU utilization.

But we have one problem with this solution, as we routinely want to “crop” the shown h264 streams. We could emulate the desired functionality by simply moving the X11 windows partly off-screen, so that only the desired part remains visible. This approach is a tad too hacky for my taste, and it does not work at all when using the nveglglessink: When moving the window out to the left, the playback stops.

Still, there is hope. The name of that sink suggests that there is EGL at work under the hood, and I assume the EGLImage extension specifically. So it seems possible to use a gstreamer pipe ending in an nveglglessink for decoding, and then use OpenVG or OpenGL (ES) to do the composition of the final image. Is this supported by NVidia?
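If the decoded frames can indeed be used as textures (which is exactly the open question here), cropping would come down to drawing each quad with a sub-range of texture coordinates instead of moving windows off-screen. The following Python sketch shows only that arithmetic. `Crop` and `crop_texcoords` are hypothetical names, and the sketch assumes pixel coordinates with the origin in the top-left corner of the frame.

```python
from typing import NamedTuple, Tuple


class Crop(NamedTuple):
    """Region of the decoded frame that should stay visible, in pixels."""
    left: int
    top: int
    width: int
    height: int


def crop_texcoords(crop: Crop, frame_w: int, frame_h: int) -> Tuple[float, float, float, float]:
    """Return (u0, v0, u1, v1), normalized to 0..1, for the cropped region."""
    if crop.width <= 0 or crop.height <= 0:
        raise ValueError("crop must have a positive size")
    if crop.left < 0 or crop.top < 0:
        raise ValueError("crop must start inside the frame")
    if crop.left + crop.width > frame_w or crop.top + crop.height > frame_h:
        raise ValueError("crop must end inside the frame")
    # Divide pixel positions by the frame size to get texture coordinates.
    u0 = crop.left / frame_w
    v0 = crop.top / frame_h
    u1 = (crop.left + crop.width) / frame_w
    v1 = (crop.top + crop.height) / frame_h
    return (u0, v0, u1, v1)


if __name__ == "__main__":
    # Assumed example: show only the centre quarter of a 1080p frame.
    print(crop_texcoords(Crop(480, 270, 960, 540), 1920, 1080))
```

The four values would be assigned to the corners of the quad that shows the stream. Depending on how the texture is stored, the v axis may need to be flipped.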

I’ve just had a brief look at the “cube_texture_and_coords” example from the “gstomx1_src” package, but fail to fully understand how the gstreamer → OpenGL handoff works. Using a nullsink and then blindly casting the buffers to (GstEGLImageMemory*) (in the update_image function, line 644 ff.) seems unsafe. Could something similar be done using the nveglglessink with “create-window” set to false?

Honestly, we didn’t actually measure the fps but just looked at it and thought “oh crap”. :-) For future measurements we’ll use the nvgstplayer --stats where due. With nveglglessink we can decode 60fps 1080p streams without problems, which gives enough headroom for our needs at this time.

Thanks again for your valuable input,
-Matthias

Good to know it helped.
As far as window positioning and size is concerned, the new release will have that support.
But ultimately it is the window manager that does the compositing, so to get windows positioned exactly as you want, you will need to switch the window manager off.
Window sizing will work from command line as expected with or without the window manager.

And yes, there is EGL at work there. I think the new release will have added support for the scenario you are describing but I am not completely sure about it.

BTW, nvgstplayer is by far the most flexible gstreamer based player to have ever existed. Trying --help with it will take you places! ;)

Regards,
Rahool

Thank you for the information. I have an additional question: is nvgstplayer able to read live streams (streams that have no length and can run “eternally”)?

Best regards.