NEON optimised VYUY2BGR

We have 6 incoming PAL video streams in interleaved YUV format, i.e. VYUY VYUY… On the Tegra TX1, we use OpenCV's cv::cvtColor to convert the video frames to BGR before sending them to OpenGL for rendering, but the performance is not good. I searched the web for a NEON-accelerated conversion but could not find one; most of what I found converts semi-planar YUV, NV21, etc.

Could you provide any function to do so, or refer us to a link for the conversion?
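
For reference, here is a minimal scalar sketch of the conversion being asked about. It is not NEON code and not taken from any NVIDIA sample; it only illustrates the pixel layout. It assumes the bytes arrive in the order V, Y0, U, Y1 (as the "VYUY" description above suggests), that the video uses limited-range BT.601 coefficients (typical for PAL, but an assumption here), that the width is even, and that rows are tightly packed. The function names are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Clamp an intermediate value into the 0..255 range of one 8-bit channel.
static inline uint8_t clamp_u8(int v) {
    return static_cast<uint8_t>(std::min(255, std::max(0, v)));
}

// Convert one limited-range BT.601 YUV sample to B, G, R (fixed point, 8 fractional bits).
static inline void yuv_to_bgr(int y, int u, int v, uint8_t* bgr) {
    const int c = 298 * (y - 16);
    const int d = u - 128;
    const int e = v - 128;
    bgr[0] = clamp_u8((c + 516 * d + 128) / 256);            // B
    bgr[1] = clamp_u8((c - 100 * d - 208 * e + 128) / 256);  // G
    bgr[2] = clamp_u8((c + 409 * e + 128) / 256);            // R
}

// Hypothetical helper: convert a tightly packed VYUY frame to packed BGR888.
// Each 4-byte group (V, Y0, U, Y1) yields two output pixels sharing U and V.
void vyuy_to_bgr(const uint8_t* src, uint8_t* dst, size_t width, size_t height) {
    const size_t pairs = (width * height) / 2;
    for (size_t i = 0; i < pairs; ++i) {
        const int v  = src[0];
        const int y0 = src[1];
        const int u  = src[2];
        const int y1 = src[3];
        yuv_to_bgr(y0, u, v, dst);
        yuv_to_bgr(y1, u, v, dst + 3);
        src += 4;
        dst += 6;
    }
}
```

A scalar loop like this is mainly useful as a correctness baseline to compare a NEON, CUDA or hardware-converter path against.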

Hi HooverLv,
Please refer to the Multimedia API section of the documentation:
https://developer.nvidia.com/embedded/dlc/l4t-documentation-24-2-1

You can install the samples via JetPack; please refer to the 07_video_convert sample.

Hi, DaneLLL,

Thank you for the link. I may need a bit more of your professional advice:
Following the link, I read in the section VIDEO FORMAT CONVERSION WITH GSTREAMER-1.0 that I could use nvvidconv to convert UYVY, but it doesn't say that it supports BGR output. Only I420, UYVY, NV12 raw-yuv and Gray8 are supported.

Hi HooverLv,
Your finding is correct. VYUY2BGR is not implemented in the GStreamer framework. You have to use the Multimedia APIs for this case.

Hi, DaneLLL,

Now I see that the Multimedia APIs are lower-level than GStreamer, so I checked the reference page (24.2.1 Release). It says I can use the device "/dev/nvhost-vic" to do video conversion. It's good to see that V4L2_PIX_FMT_UYVY is included. However, only 2 RGB formats are supported, V4L2_PIX_FMT_ABGR32 and V4L2_PIX_FMT_XBGR32. I don't see BGR888 among them.

Hi HooverLv,
Unfortunately, BGR888 is not supported. You can re-sample XBGR32 -> BGR888 via CUDA to get good performance.
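
To make the XBGR32 -> BGR888 step concrete, here is a minimal CPU sketch of the repacking. It is only a reference for the byte shuffling, not the CUDA version suggested above; a CUDA kernel would do the same per-pixel copy with one thread per pixel. It assumes each XBGR32 pixel is stored in memory as the bytes B, G, R, X (my understanding of the V4L2 definition, worth checking against the converter's actual output) and that both buffers are tightly packed. The function name is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: drop the padding byte of every 4-byte XBGR32 pixel
// (assumed in-memory order B, G, R, X) to produce packed 3-byte BGR888.
void xbgr32_to_bgr888(const uint8_t* src, uint8_t* dst, size_t pixel_count) {
    for (size_t i = 0; i < pixel_count; ++i) {
        dst[3 * i + 0] = src[4 * i + 0];  // B
        dst[3 * i + 1] = src[4 * i + 1];  // G
        dst[3 * i + 2] = src[4 * i + 2];  // R
        // src[4 * i + 3] is the unused X byte and is skipped.
    }
}
```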

Hi HooverLv,
To do the conversion via CUDA, please refer to the backend sample of the Multimedia API.
The sample does H264 -> YUV420 -> RGBA -> BGR -> object identification.
You can refer to the RGBA -> BGR step, which is done via CUDA.

Hi, DaneLLL,

Thank you very much.

A further issue:

I queued 6 YUV frames (from 6 threads) to the videoConverter. When the 6 callbacks return, how can I tell which ARGB frame came from which YUV frame? Is there any data field in the V4L2 buffer that I can use to map it back to the input YUV buffer?

Let’s go to https://devtalk.nvidia.com/default/topic/992751/jetson-tx1/multimedia-api-videoconverter/post/5080333/#5080333