Convert float* from CaptureRGBA to cv::Mat for further processing. Jetson-Inference Live Camera Detection

[EDIT] Solved it! I implemented it the other way round with this helpful comment:

Hello everybody,

I have been working with this tutorial:

My goal is parallel object detection with two cameras, plus Canny edge detection. I have successfully implemented it in Python, where I used:

imgCV = jetson.utils.cudaToNumpy(imgRGBA,width,height,4)

to convert imgRGBA (filled by CaptureRGBA(&imgRGBA0, 1000)) for use with OpenCV.

Is there a similar function for this conversion in C++? I have been searching the whole day and would be grateful for any help.

So basically I want to convert the imgRGBA from glDisplay:

camera0->CaptureRGBA(&imgRGBA0, 1000)

I would even try to implement a conversion by hand, but I’m not able to get the values from imgRGBA0.

Just opening it with imshow obviously does not work and throws the error:

note:   no known conversion for argument 2 from ‘float*’ to ‘cv::InputArray {aka const cv::_InputArray&}’

I have also tried to convert it with this function from the utils (cudaRGB.h), but it does not work either:

cudaRGBA32ToRGB8(imgRGBA0, output, width,height );

with the error message:

note:   no known conversion for argument 1 from ‘float*’ to ‘float4*’

[EDIT] As I forgot to mention: parallel object detection with 2 camera streams works in C++, but I also want to display a Canny edge detection in a third window, which is why I need the connection to OpenCV (or just simple access to the pixel values).

Any help would be appreciated!


I believe you are on the right track here, you just need to cast your pointer to (float4*). And probably the output pointer to (uchar3*), like so:

cudaRGBA32ToRGB8((float4*)imgRGBA0, (uchar3*)output, width,height );

By default, the camera video frames reside only in GPU memory. However, you can make them reside in shared CPU/GPU zeroCopy memory by passing true for the optional 3rd parameter of gstCamera::CaptureRGBA(); the camera frames will then be accessible from the CPU. Just make sure the GPU has completed its kernels before trying to access them from the CPU (i.e. insert a call to cudaDeviceSynchronize() before accessing from the CPU).
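Once the frame is in zeroCopy memory and the GPU has been synchronized, the float* can be read on the CPU like any other array. Here is a minimal sketch of doing the conversion by hand in plain C++. The helper name rgbaFloatToBgr8 is hypothetical (it is not part of jetson-utils or OpenCV), and it assumes the buffer is packed as 4 floats per pixel (R, G, B, A) with values in the 0..255 range, so check that against your own frames:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of jetson-utils or OpenCV.
// Converts a packed float RGBA image (4 floats per pixel, ASSUMED range 0..255)
// into packed 8-bit BGR, which is the channel order OpenCV uses by default.
std::vector<std::uint8_t> rgbaFloatToBgr8(const float* rgba, int width, int height)
{
    assert(rgba != nullptr && width > 0 && height > 0);

    const std::size_t pixels =
        static_cast<std::size_t>(width) * static_cast<std::size_t>(height);
    std::vector<std::uint8_t> bgr(pixels * 3);

    // Clamp to 0..255 and round to nearest before narrowing to 8 bits.
    auto toByte = [](float v) {
        const float clamped = std::min(255.0f, std::max(0.0f, v));
        return static_cast<std::uint8_t>(std::lround(clamped));
    };

    for (std::size_t i = 0; i < pixels; ++i) {
        const float* src = rgba + i * 4;         // R, G, B, A
        std::uint8_t* dst = bgr.data() + i * 3;  // B, G, R
        dst[0] = toByte(src[2]);
        dst[1] = toByte(src[1]);
        dst[2] = toByte(src[0]);                 // alpha is dropped
    }
    return bgr;
}
```

With OpenCV available, the returned buffer could then be wrapped without a copy as cv::Mat(height, width, CV_8UC3, bgr.data()), as long as the vector outlives the Mat. The cudaRGBA32ToRGB8 route above does the same kind of conversion on the GPU instead.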

Hi Dusty

I have the exact same problem. I have tried following your guide but cannot figure out how to call cudaDeviceSynchronize(). Is there a way of doing this in a simple C++ file (a function in jetson-utils), or do I have to write a CUDA file?

Any help on this is much appreciated!
Kind Regards, Esben

Hi @esbenkt, you can call cudaDeviceSynchronize() from a simple C++ file, because it is a CUDA runtime C API (as opposed to a CUDA kernel, which needs to be compiled for the GPU in a .cu file).

Just have a #include <cuda_runtime.h> in your cpp file first.

Hi Dusty

Thanks for the quick response!
I’m getting "undefined reference to ‘CudaDeviceSynchronize’". I am using g++ to compile my main.cpp file with #include <cuda_runtime.h>, compiled with -I/usr/local/cuda/include -L/usr/local/cuda/lib64 -lcuda -lcudart -lcublas -lcurand. I’m running it on a Jetson Xavier NX with JetPack installed and set up. Am I missing something or compiling it wrong? My web search for this error only yields results about using the nvcc compiler on .cu files.

Hmm, ok - can you try using nvcc to compile your cpp file? nvcc can compile both.

Admittedly, I usually use cmake, which automatically handles which CUDA libraries need to be linked.

Hi Dusty

Managed to get it to work with cmake, thanks for the help!