Resize decoded streams using GStreamer + OpenCV on Jetson Nano

Hi,
I am working with the hardware-accelerated decoder (NVDEC) on a Jetson Nano for multiple streams, using GStreamer + OpenCV.
My problem is the resize step. As you know, resizing frames with cv2.resize() is slow, and I want to make this part faster. GStreamer also has a converter that can resize. If I do the resizing in GStreamer and then pass the frames to OpenCV, in your opinion, will this be faster than cv2.resize()? Is the resize done in hardware if I use GStreamer for it?

It depends on your use case. If you just need a lower resolution, it is better to capture at that low resolution as early as possible.
In the following example, the camera is managed by nvarguscamerasrc, so we request a 1280x720 resolution.
If the resolution you want cannot be achieved by nvarguscamerasrc, or if the stream comes from another source such as NVDEC, you may use nvvidconv, which can do the rescale efficiently with the VIC HW. Here we resize to 640x480 while converting from NV12 to BGRx format, so that videoconvert only has to strip the extra 4th byte of each pixel.
Note that some unconventional odd resolutions may fail with nvvidconv. In such cases you may try the videoscale plugin, but it may be much slower; a sketch of such a fallback is below.
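For example, a CPU fallback through videoscale might look like the following fragment (a sketch only; the 450x450 size is just for illustration):

video/x-raw(memory:NVMM), format=NV12 ! nvvidconv ! video/x-raw, format=BGRx ! videoscale ! video/x-raw, width=450, height=450 ! videoconvert ! video/x-raw, format=BGR ! appsink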
The last case is when you have to process both the full OpenCV input resolution and a smaller resolution. For this case, the following example uses the GPU to resize to 320x240 with the OpenCV cudawarping module.

#include <iostream>

#include <opencv2/opencv.hpp>
#include <opencv2/videoio.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/cudawarping.hpp>

int main()
{
    // Capture 1280x720 NV12 from the camera, rescale to 640x480 BGRx with
    // nvvidconv (VIC HW), then let videoconvert strip the 4th byte to get BGR.
    const char* gst = "nvarguscamerasrc ! video/x-raw(memory:NVMM), format=NV12, width=1280, height=720 ! nvvidconv ! video/x-raw, format=BGRx, width=640, height=480 ! videoconvert ! video/x-raw, format=BGR ! appsink";
    cv::VideoCapture* capPtr = new cv::VideoCapture(gst, cv::CAP_GSTREAMER);
    if (!capPtr->isOpened()) {
        std::cout << "Failed to open camera." << std::endl;
        return -1;
    }

    cv::namedWindow("Read frame", cv::WINDOW_AUTOSIZE);
    cv::namedWindow("Resized frame", cv::WINDOW_AUTOSIZE);
    cv::Mat frame_in, frame_out;
    cv::cuda::GpuMat d_in, d_out;
    while (1)
    {
        if (!capPtr->read(frame_in)) {
            std::cout << "Capture read error" << std::endl;
            break;
        }
        else {
            // Second, smaller resolution: resize on the GPU with cudawarping.
            d_in.upload(frame_in);
            cv::cuda::resize(d_in, d_out, cv::Size(320, 240));
            d_out.download(frame_out);
            cv::imshow("Read frame", frame_in);
            cv::imshow("Resized frame", frame_out);
            if ((char)cv::waitKey(1) == 27)
                break;
        }
    }

    capPtr->release();
    delete capPtr;
    return 0;
}

Thanks.
I work with Python code.
My input is a standard IP camera (RTSP) resolution such as 640x480 or 1920x1080,
but for the output I want 450x450.
I don't want to use CUDA for this processing, because CUDA is already used by my deep learning model.
In your opinion, is the following pipeline efficient in OpenCV?

video/x-raw(memory:NVMM), format=NV12 ! nvvidconv ! video/x-raw,format=BGRx, width=450, height=450 ! videoconvert ! appsink

The pipeline looks ok, although you may specify BGR caps before appsink.

What may be wrong is the final resolution: use a width and height that are divisible by 4.
448 or 452 may work better.

1- So you suggest using it like this?

video/x-raw(memory:NVMM), format=NV12 ! nvvidconv ! video/x-raw,format=BGRx, width=450, height=450 ! videoconvert ! video/x-raw,format=BGR ! appsink

2- Why should the width and height be divisible by 4?

  1. Yes.
  2. It can work, but I have seen problems with OpenCV when the sizes are not a multiple of 4. It may only be related to imshow, but I don’t know. As a general rule, the more powers of 2 in the width and height, the more efficient the implementation can be (as long as a dimension can be divided by 2, the work can be split across 2 parallel resources). A minimal Python sketch of the full capture is below.
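Here is a minimal Python sketch of the complete capture. The RTSP URL, the latency value, and the use of nvv4l2decoder for H.264 decoding are assumptions, not taken from this thread; adjust them for your camera and JetPack version (older releases used omxh264dec instead):

import cv2

# RTSP URL and latency are placeholders; nvv4l2decoder is an assumed
# HW decoder element for H.264 on Jetson.
gst = ("rtspsrc location=rtsp://user:pass@192.168.1.2:554/stream latency=200 ! "
       "rtph264depay ! h264parse ! nvv4l2decoder ! "
       "nvvidconv ! video/x-raw, format=BGRx, width=448, height=448 ! "
       "videoconvert ! video/x-raw, format=BGR ! appsink")

cap = cv2.VideoCapture(gst, cv2.CAP_GSTREAMER)
if not cap.isOpened():
    raise RuntimeError("Failed to open RTSP stream")

while True:
    ok, frame = cap.read()      # frame is a 448x448 BGR numpy array
    if not ok:
        break
    cv2.imshow("Resized frame", frame)
    if cv2.waitKey(1) == 27:    # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()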

Thanks, an output size of 300x300 works.
1- Is this resizing done on the CPU in OpenCV, or by a hardware accelerator?
2- Isn’t it better to use appsink sync=true?

  1. I’d expect the video scaling by nvvidconv to be done by ISP HW.
  2. I think the sync option of a gstreamer pipeline is true by default, but it should be harmless to set it explicitly, as shown below.
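For example, at the end of the pipeline (sync is a standard GStreamer appsink property):

... ! videoconvert ! video/x-raw, format=BGR ! appsink sync=true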

1- Is the ISP HW used for decoding IP camera streams?
2- What is the difference between the ISP HW and the H264 HW accelerator?
If possible, could you explain what the ISP HW is? Thanks.

How can I control the frames per second with gstreamer?

Sorry, I worded it weirdly. I tend to say ISP instead of IPP for nvvidconv. You may have a look at the NvMedia stack for details (this link is from DRIVE, though).

The best way is to use a native sensor framerate if possible.
Otherwise, you may try videorate, but it may generate some CPU load at high resolutions. Try to use an integer division of the framerate (e.g. 30 fps → 15, 10, 6 or 5 fps), as in the fragment below.
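For example, dropping a 30 fps decoded stream to 15 fps (the sizes and rates here are just illustrative):

... ! nvvidconv ! video/x-raw, format=BGRx, width=448, height=448 ! videorate ! video/x-raw, framerate=15/1 ! videoconvert ! video/x-raw, format=BGR ! appsink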