[Argus] obtaining an RGB (CUDA) image

Hello,

Following some examples in the tegra_multimedia_api/argus folder, I am able to capture images in YUV420 format from a CSI camera. Using

...
iEventProvider->waitForEvents(queue.get());
UniqueObj<Frame> frame(iFrameConsumer->acquireFrame());
...

Then using

IFrame *iFrame = interface_cast<IFrame>(frame);
auto *iNativeBuffer = interface_cast<IImageNativeBuffer>(iFrame->getImage());

followed by a combination of createNvBuffer, NvBufferMemMap, and NvBufferMemSyncForCpu, I get the image into CPU memory and do the color conversion to RGB using OpenCV. This works, but it is very slow.

Can someone please explain to me how to use hardware-accelerated API calls to do the color conversion?
I would like to pre-allocate a CUDA array of the appropriate size and convert incoming frames to RGB color space.

I would like to end up with RGB (not ARGB or RGBA).

Please note that I have already looked at NvVideoConverter, as it is mentioned in a few other topics. I just don’t see how to use it exactly. Could someone perhaps provide me with some help, or a patch against one of the simpler examples (oneShot or yuvJpeg) to demonstrate this?

Hi Beerend,
For Argus + OpenCV, please refer to
https://devtalk.nvidia.com/default/topic/1037863/jetson-tx2/argus-and-opencv/post/5273400/#5273400

The HW converter (NvVideoConverter) supports 32-bit RGBA and BGRx but not 24-bit BGR. You can do either of the following:
1. Convert RGBA to BGR via CUDA. Please refer to tegra_multimedia_api/samples/backend.
2. Do NV12/YUV420 to BGR conversion via CUDA. Please refer to the sample code below:
https://github.com/dusty-nv/jetson-inference/blob/master/util/cuda/cudaYUV-NV12.cu

Hi DaneLLL,

Thanks for your answer. I will try out your first suggestion using the createNvBuffer function. I have tried something similar before but couldn’t get it to work.

In the end I would like to run TensorRT inference on this frame (or a batch of them) so I guess it would be preferable to keep the data in GPU memory. I am looking at this example I found:

https://github.com/vat-nvidia/deepstream-plugins/tree/master/sources/gst-yoloplugin/yoloplugin_lib

There still seems to be quite a bit of CPU work in the dsPreProcessBatchInput function, but resize and copyMakeBorder have their equivalents in the cv::cuda namespace, so I guess it should also be possible to run those on the GPU?

Do you have any thoughts or suggestions on this?

Hi Beerend,
tegra_multimedia_api/samples/frontend is the sample demonstrating Argus -> TensorRT. Please refer to it.