Jetson TK1 - OpenCV4Tegra - YUV to RGBA conversion


I am testing OpenCV4Tegra on my Jetson TK1.
I want to convert a YUV buffer to RGBA efficiently.
This line works:

cv::cvtColor(src, dst, (int)cv::COLOR_YUV2BGRA_Y422);

But the performance is not sufficient: the conversion takes 80 ms for a 2-megapixel image.
I also see that only one CPU core is used, even though I can read in the OpenCV4Tegra description:

As a second attempt, I tried the OpenCV GPU module:

cv::gpu::cvtColor(srcGpu, dstGpu, (int)cv::COLOR_YUV2BGRA_UYVY);

Unfortunately, an error is thrown:

  • "OpenCV Error: Bad flag (parameter or structure field) (Unknown/unsupported color conversion code) in cvtColor, file /hdd/buildbot/slave_jetson_tk1_2/52-O4T-L4T/opencv/modules/gpu/src/color.cpp"

So it seems that this conversion is not implemented in the GPU module. However, suitable conversion functions do exist in the NPP library, for example “nppiCbYCr422ToBGR_709HDTV_8u_C2C4R”.

This experience leads me to ask two questions:

Is there a way to improve the performance of cvtColor on the CPU
(for example, by enabling multicore execution)?

Is there any chance to get the COLOR_YUV2BGRA_Y422 conversion implemented in the OpenCV GPU module in the future?

Setup information:
Installed from JetPack 1.2,
with:

  • OpenCV4Tegra :
  • L4T : 21.4

Thank you in advance for your advice,


Why don’t you use YUV2BGR and add an alpha channel (all zero) afterwards with a GPU function?
According to the gpu/color.cpp source file, CV_YUV2BGRA_UYVY is not implemented:

0,                      // CV_YUV2BGRA_UYVY = 112,

but CV_YUV2BGR is implemented:

yuv_to_bgr,             // CV_YUV2BGR      = 84
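
To make the "convert to BGR, then add alpha" idea above concrete, here is a minimal CPU-side sketch in plain C++, with no OpenCV dependency. The helper name `bgrToBgra` and its `alpha` parameter are made up for illustration; in OpenCV this step would normally be a second colour conversion from BGR to BGRA, and whether the GPU module of a given build supports that code should be checked.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Expand packed 8-bit BGR pixels to packed BGRA with a constant alpha.
// Hypothetical helper for illustration; not an OpenCV function.
std::vector<std::uint8_t> bgrToBgra(const std::vector<std::uint8_t>& bgr,
                                    std::uint8_t alpha) {
    const std::size_t pixels = bgr.size() / 3;
    std::vector<std::uint8_t> bgra(pixels * 4);
    for (std::size_t i = 0; i < pixels; ++i) {
        bgra[4 * i + 0] = bgr[3 * i + 0];  // B
        bgra[4 * i + 1] = bgr[3 * i + 1];  // G
        bgra[4 * i + 2] = bgr[3 * i + 2];  // R
        bgra[4 * i + 3] = alpha;           // A (0 as suggested above, or 255 for opaque)
    }
    return bgra;
}
```

Passing `alpha = 0` matches the "all zero" suggestion; most display paths expect 255 for an opaque image.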


Thank you for your answer.

What I need from COLOR_YUV2BGRA_UYVY is the ‘_UYVY’ suffix more than the ‘BGRA’ part,
because the input is not plain 3-channel YUV; it is packed in a format from the 4:2:2 family.
So judging from the suggested file gpu/color.cpp in the community repository, this does not look promising…
I had hoped that NVIDIA implemented more cases than the original repository.

Maybe I will try the NPP library directly…
Any further suggestions about the CPU behaviour?

Thank you,
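
On the CPU question, one possible direction is to split the image into horizontal strips and convert each strip on its own thread. The sketch below shows the idea in plain C++ without OpenCV. The function names are hypothetical, the width is assumed to be even, and the colour matrix is assumed to be video-range BT.601 with common 8-bit fixed-point coefficients, so results may differ by a few levels from what cvtColor produces.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Clamp an integer to the 8-bit range.
static std::uint8_t clamp8(int v) {
    return static_cast<std::uint8_t>(std::min(255, std::max(0, v)));
}

// One YUV sample to BGRA (assumed BT.601 video range, fixed point scaled by 256).
static void yuvToBgra(int y, int u, int v, std::uint8_t* out) {
    const int c = 298 * (y - 16), d = u - 128, e = v - 128;
    out[0] = clamp8((c + 516 * d + 128) / 256);            // B
    out[1] = clamp8((c - 100 * d - 208 * e + 128) / 256);  // G
    out[2] = clamp8((c + 409 * e + 128) / 256);            // R
    out[3] = 255;                                          // A (opaque)
}

// Convert rows [rowBegin, rowEnd) of a packed UYVY image (U Y0 V Y1 per pixel pair).
static void convertRows(const std::uint8_t* src, std::uint8_t* dst,
                        int width, int rowBegin, int rowEnd) {
    for (int r = rowBegin; r < rowEnd; ++r) {
        const std::uint8_t* s = src + static_cast<std::size_t>(r) * width * 2;
        std::uint8_t* d = dst + static_cast<std::size_t>(r) * width * 4;
        for (int x = 0; x < width; x += 2, s += 4, d += 8) {
            yuvToBgra(s[1], s[0], s[2], d);      // first pixel of the pair
            yuvToBgra(s[3], s[0], s[2], d + 4);  // second pixel shares U and V
        }
    }
}

// Split the image into horizontal strips, one thread per strip.
std::vector<std::uint8_t> uyvyToBgraParallel(const std::vector<std::uint8_t>& uyvy,
                                             int width, int height, int numThreads) {
    std::vector<std::uint8_t> bgra(static_cast<std::size_t>(width) * height * 4);
    std::vector<std::thread> workers;
    const int rowsPerThread = (height + numThreads - 1) / numThreads;
    for (int t = 0; t < numThreads; ++t) {
        const int begin = t * rowsPerThread;
        const int end = std::min(height, begin + rowsPerThread);
        if (begin >= end) break;  // more threads than rows
        workers.emplace_back(convertRows, uyvy.data(), bgra.data(), width, begin, end);
    }
    for (std::thread& w : workers) w.join();
    return bgra;
}
```

Each thread writes to a disjoint range of output rows, so no locking is needed. On the TK1 a call such as `uyvyToBgraParallel(frame, width, height, 4)` would use the four main cores; whether this actually beats the single-threaded cvtColor would have to be measured.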

My final suggestion is to copy the raw UYVY data into a 3-channel YUV buffer and upload it to a gpu::GpuMat.
In my case, copying the UYVY data to a YUV buffer and running cvtColor on the GPU takes more time than running it on the CPU, because the image resolution is low (I use a 360x288 image). If you run more GPU operations after cvtColor, or use a high-resolution image, then it is worth uploading the image to the GPU and running cvtColor there.

Here is the sample code that I used:

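// matYUV: host-side Mat (288 rows x 360 cols, CV_8UC3). The name is assumed;
// it was lost in the original post. pImg points at the raw 4:2:2 input buffer.
for (int y = 0; y < 288; y++) {
	for (int x = 0; x < 360; x++) {
		Vec3b& val = matYUV.at<Vec3b>(y, x);
		val[0] = pImg[1];
		val[1] = pImg[2];
		val[2] = pImg[0];
		pImg += 4;	// advance one 4-byte pixel pair, keeping one pixel of the two
	}
	pImg += 720*2;	// skip one full source row (720 pixels x 2 bytes)
}
matGPUYUV.upload(matYUV);	// copy the unpacked buffer to the GPU
gpu::cvtColor(matGPUYUV, matGPUBGR, CV_YUV2BGR);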
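
The sample above appears to keep only one pixel of every pair and every other row. For a high-resolution image you would probably want every pixel, so here is a minimal sketch of a full-resolution unpack in plain C++. The helper name `unpackUyvy` is hypothetical, the width is assumed to be even, and the output order is Y, U, V; the sample above writes bytes 1, 2, 0 instead, so adjust the order to whatever your source layout and conversion code expect.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Unpack UYVY (U Y0 V Y1 per pixel pair) into interleaved Y,U,V, 3 bytes per pixel.
// Hypothetical helper for illustration; not an OpenCV function.
std::vector<std::uint8_t> unpackUyvy(const std::uint8_t* src, int width, int height) {
    std::vector<std::uint8_t> yuv(static_cast<std::size_t>(width) * height * 3);
    std::uint8_t* d = yuv.data();
    const std::size_t pairs = static_cast<std::size_t>(width) * height / 2;
    for (std::size_t i = 0; i < pairs; ++i, src += 4, d += 6) {
        d[0] = src[1]; d[1] = src[0]; d[2] = src[2];  // first pixel:  Y0 U V
        d[3] = src[3]; d[4] = src[0]; d[5] = src[2];  // second pixel: Y1 U V
    }
    return yuv;
}
```

The resulting buffer can be wrapped in a 3-channel 8-bit Mat of the same width and height before uploading it to the GPU.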