Hi all,
I have a 4x 5MP camera solution streaming from a Xavier 32GB. I have implemented CLAHE through nvivafilter, but I can't reach the 24 FPS per camera that I need: I benchmark 13-14 FPS with the filter in place. Without the nvivafilter stage I get 24 FPS with no problem, so I'm looking into how to speed that stage up.
The relevant part of my pipeline, once configured, looks like this:
gst-launch-1.0 v4l2src device=/dev/video0 ! 'video/x-raw, format=BGRx' ! nvvidconv ! 'video/x-raw(memory:NVMM), format=NV12' ! nvivafilter customer-lib-name="libnviva_clahe.so" cuda-process=true ! 'video/x-raw(memory:NVMM), format=(string)RGBA' ! nvvidconv ! 'video/x-raw(memory:NVMM), format=NV12' !
The relevant part of the nvivafilter gpu_process is the following:
static Ptr<cv::cuda::CLAHE> clahe;
static GpuMat gpuframe_3channel(height, width, CV_8UC3);
vector<GpuMat> yuv_planes(3);
if (!clahe) {
    clahe = cv::cuda::createCLAHE(2.5, Size(6,6));
}
GpuMat d_mat(height, width, CV_8UC4, pdata);
cv::cuda::cvtColor(d_mat, gpuframe_3channel, CV_BGR2YUV, 3);
cv::cuda::split(gpuframe_3channel, yuv_planes);
clahe->apply(yuv_planes[0], yuv_planes[0]);
cv::cuda::merge(yuv_planes, gpuframe_3channel);
cv::cuda::cvtColor(gpuframe_3channel, d_mat, CV_YUV2BGR, 4);
I'm looking for ways to optimize this pipeline. For example, I currently have to go through NV12->RGBA->YUV->RGBA->NV12 to make this work. Since CLAHE only touches the Y channel, it would be much better to do NV12->YUV->NV12, but I can't figure out how to get NV12 data into OpenCV and back again… Other suggestions on how to improve the framerate here would be much appreciated! :)
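To make the NV12->Y->NV12 idea concrete, here is a minimal CPU-side sketch of how an NV12 frame is laid out and how a luma-only operation can run on the Y plane in place without touching chroma. Everything named here (`Nv12View`, `wrap_nv12`, `stretch_luma`) is hypothetical and only for illustration. The contrast stretch is a simple stand-in for CLAHE, and the sketch assumes the UV plane directly follows the Y plane in one contiguous buffer, which may not hold for the NVMM buffers that nvivafilter hands to gpu_process.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Non-owning view of one NV12 frame: a full-resolution Y plane plus a
// half-height plane of interleaved U,V bytes. Hypothetical helper type.
struct Nv12View {
    std::uint8_t* y;     // luma plane: `height` rows of `pitch` bytes
    std::uint8_t* uv;    // chroma plane: `height / 2` rows of `pitch` bytes
    int width;           // visible pixels per row
    int height;          // visible rows
    std::size_t pitch;   // bytes per row, may be larger than `width`
};

// Interpret one contiguous buffer as NV12.
// Assumption: the UV plane starts right after the Y plane.
inline Nv12View wrap_nv12(std::uint8_t* data, int width, int height,
                          std::size_t pitch) {
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
    assert(pitch >= static_cast<std::size_t>(width));
    return Nv12View{data, data + pitch * static_cast<std::size_t>(height),
                    width, height, pitch};
}

// Stand-in for CLAHE: a global min/max contrast stretch applied to luma only.
// In a CUDA filter the Y plane could instead be wrapped as a single-channel
// GpuMat (rows, cols, CV_8UC1, data pointer, step) and passed to
// clahe->apply(); that variant is untested here.
inline void stretch_luma(const Nv12View& f) {
    std::uint8_t lo = 255, hi = 0;
    for (int r = 0; r < f.height; ++r) {
        const std::uint8_t* row = f.y + static_cast<std::size_t>(r) * f.pitch;
        for (int c = 0; c < f.width; ++c) {
            lo = std::min(lo, row[c]);
            hi = std::max(hi, row[c]);
        }
    }
    if (hi == lo) return;  // flat image, nothing to stretch
    const int range = hi - lo;
    for (int r = 0; r < f.height; ++r) {
        std::uint8_t* row = f.y + static_cast<std::size_t>(r) * f.pitch;
        for (int c = 0; c < f.width; ++c) {
            row[c] = static_cast<std::uint8_t>((row[c] - lo) * 255 / range);
        }
    }
    // The UV plane is deliberately left untouched: no colour conversion,
    // split, or merge is needed for a luma-only operation.
}
```

Calling `wrap_nv12(buffer, width, height, pitch)` and then `stretch_luma(view)` modifies only the visible luma bytes. If the filter's output caps were NV12 instead of RGBA, the same pattern would in principle remove both cvtColor calls plus the split and merge, but whether nvivafilter exposes the planes this way on a given JetPack release is something I have not verified.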