Reading a frame from an RTSP stream into a data structure (NvBuffer?) and converting it to JPEG

I want to read a frame from an RTSP stream and convert it to JPEG in my C++ application, preferably without a GStreamer pipeline, though that is fine if it is the only option. Is it possible to convert the frame to JPEG before storing it? Also, what data structure should hold the frame in this case (I was using cv::Mat before the need for compression arose) if I want to keep it in memory rather than write it to a file?

OpenCV can read RTSP streams through a VideoCapture object, just like webcams, although it probably uses GStreamer in the background for that. Passing a custom GStreamer pipeline (which VideoCapture also accepts) gives more control. You could probably also do the JPEG conversion in the pipeline with the NVIDIA JPEG encoder (the nvjpegenc element).

The alternative is to get the cv::Mat via VideoCapture and then use OpenCV for compression. You can use cv::imencode to do this in memory; it writes the encoded bytes into a std::vector<uchar>.

Yes, I was doing that before, but I don't want to spend any CPU resources on that step. I was wondering whether MMAPI provides a way to read a frame from an RTSP stream and store it in a data structure optimized for the NVIDIA architecture (I'm not sure which one) instead of cv::Mat. Another question: if I use nvjpeg compression in the GStreamer pipeline, will I still be able to store the image in a cv::Mat?

The RTSP data arrives in CPU memory anyway, so you then have to copy it to the GPU, either through the GStreamer pipeline or manually with CUDA. Jetson has unified memory, so a real copy can sometimes be avoided.

For maximum performance I would receive the RTSP image, decode it with nvdecode, encode it with nvjpegenc and use the result. All of that can be done in a GStreamer pipeline. OpenCV doesn't support a compressed cv::Mat, so I'm not sure what would happen if your VideoCapture returned a compressed buffer; I think it would break because the size would be different. If you want both a compressed and an uncompressed cv::Mat, you can split the GStreamer pipeline in two with a tee element, but then you have to use the GStreamer C API to get the two buffers separately.
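
The tee split described above could be written as a pipeline description along these lines. This is an untested sketch: the sink names raw_sink and jpeg_sink and the helper make_tee_pipeline are hypothetical, no caps filters are shown (some will probably be needed between the elements), and the resulting string is meant for gst_parse_launch in the GStreamer C API rather than for cv::VideoCapture, which only exposes a single appsink.

```cpp
#include <string>

// Hypothetical helper: builds a pipeline description with two branches.
// One branch delivers raw frames, the other delivers JPEG-encoded frames.
// Element names come from the discussion above; caps filters are omitted
// and may be required for negotiation on a real device.
std::string make_tee_pipeline(const std::string& rtsp_url) {
  return "rtspsrc location=" + rtsp_url +
         " ! decodebin ! nvvidconv ! tee name=t "
         "t. ! queue ! appsink name=raw_sink "
         "t. ! queue ! nvjpegenc ! appsink name=jpeg_sink";
}
```

After parsing the description, each appsink could be looked up by its name (for example with gst_bin_get_by_name) and polled separately. The queue after each tee branch keeps one slow consumer from stalling the other.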

I think an nvJPEG API is available as well, so you can compress on the GPU if you already have the cv::Mat.

It would help if you explained your use case in more detail:
What is the encoding of your RTSP stream?
What kind of processing do you want to perform on the frames?
What are your requirements for storing frames on disk? Do you want to store the processed frames?

hi @Honey_Patouceul
1- The stream is currently encoded as H.264, but that may change
2- I just want to store them in memory in JPEG format, in a data structure such as an array of frames or a ring buffer
3- I don’t want to store frames on disk, they will just stay in memory for the duration of the application
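
An in-memory ring buffer of JPEG frames, as described in point 2, could look roughly like the following. This is a minimal sketch that assumes each compressed frame is held as a plain byte vector; the names JpegFrame and JpegRingBuffer are hypothetical, and the class is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// One compressed frame: the raw bytes of a JPEG image.
using JpegFrame = std::vector<std::uint8_t>;

// Hypothetical fixed-capacity ring buffer that overwrites the oldest
// frame once it is full.
class JpegRingBuffer {
public:
  explicit JpegRingBuffer(std::size_t capacity) : slots_(capacity) {}

  // Store a frame, replacing the oldest one when the buffer is full.
  void push(JpegFrame frame) {
    if (slots_.empty()) return;  // zero capacity: nothing can be stored
    slots_[head_] = std::move(frame);
    head_ = (head_ + 1) % slots_.size();
    if (count_ < slots_.size()) ++count_;
  }

  // Number of frames currently stored.
  std::size_t size() const { return count_; }

  // Access by age: index 0 is the oldest stored frame.
  const JpegFrame& at(std::size_t i) const {
    assert(i < count_);
    const std::size_t oldest = (head_ + slots_.size() - count_) % slots_.size();
    return slots_[(oldest + i) % slots_.size()];
  }

private:
  std::vector<JpegFrame> slots_;
  std::size_t head_ = 0;   // next slot to write
  std::size_t count_ = 0;  // number of valid frames
};
```

Because JPEG frames vary in size, each slot owns its own byte vector instead of sharing one fixed-size block. Moving the frame into push avoids a second copy of the compressed data.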

The easiest way might be to use a GStreamer pipeline that receives the RTSP stream, decodes the H.264 (or whatever codec is used) and then encodes to JPEG.

Here is a quick example with OpenCV. The video capture uses such a pipeline. Since the original video is only 240x160, I resize it to 1280x720 with nvvidconv.
Frames are read in JPEG format (you may buffer these) and then decoded to BGR for display:

#include <iostream>
#include <opencv2/opencv.hpp>
#include <opencv2/videoio.hpp>
#include <opencv2/imgcodecs.hpp>

int main ()
{
  // The RTSP location below is incomplete as posted; put the full stream URL here.
  const char *gst = "rtspsrc location=rtspt:// ! decodebin ! nvvidconv interpolation-method=5 ! video/x-raw, width=1280, height=720 ! jpegenc ! image/jpeg ! appsink";
  cv::VideoCapture cap (gst, cv::CAP_GSTREAMER);
  if (!cap.isOpened ()) {
     std::cout << "Failed to open camera." << std::endl;
     return (-1);
  }

  std::cout << "Video Capture opened (backend: " << cap.getBackendName() << ")" << std::endl;
  unsigned int width = (unsigned int) cap.get (cv::CAP_PROP_FRAME_WIDTH);
  unsigned int height = (unsigned int) cap.get (cv::CAP_PROP_FRAME_HEIGHT);
  unsigned int fps = (unsigned int) cap.get (cv::CAP_PROP_FPS);
  unsigned int pixels = width * height;
  std::cout << "Frame size : " << width << " x " << height << ", " << pixels << " Pixels @" << fps << " FPS" << std::endl;

  cv::namedWindow("RTSP Preview", cv::WINDOW_AUTOSIZE );
  cv::Mat frame_in;   // holds the JPEG-compressed bytes of one frame

  double prev = (double) cv::getTickCount ();
  while (1) {
    if (!cap.read (frame_in)) {
      std::cout << "Capture read error" << std::endl;
      break;
    }
    else {
      // Time elapsed since the previous frame, in seconds
      double cur = (double) cv::getTickCount ();
      double delta = (cur - prev) / cv::getTickFrequency ();
      std::cout << "delta=" << delta << std::endl;
      prev = cur;

      // Decode the JPEG buffer into BGR for display only
      cv::Mat frame_dec(height, width, CV_8UC3);
      cv::imdecode(frame_in, cv::IMREAD_COLOR, &frame_dec);
      cv::imshow("RTSP Preview",frame_dec);
      cv::waitKey(1);   // lets HighGUI refresh the window
    }
  }

  cap.release ();

  return 0;
}
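
To buffer the compressed frames from the loop above, the bytes need to be copied out before the next read overwrites frame_in. The following is a minimal sketch of that step using only the standard library; looks_like_jpeg and copy_jpeg are hypothetical helpers. With OpenCV, the pointer and size would presumably be frame_in.data and frame_in.total() * frame_in.elemSize(), assuming the Mat is continuous.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Cheap sanity check: a JPEG stream starts with the SOI marker (FF D8)
// and normally ends with the EOI marker (FF D9). This does not validate
// the contents in between, and an encoder that pads its output after
// EOI would fail the check.
bool looks_like_jpeg(const std::uint8_t* data, std::size_t size) {
  if (data == nullptr || size < 4) return false;
  return data[0] == 0xFF && data[1] == 0xD8 &&
         data[size - 2] == 0xFF && data[size - 1] == 0xD9;
}

// Copies the compressed bytes into an owned buffer so they outlive the
// capture buffer. Returns an empty vector if the data fails the check.
std::vector<std::uint8_t> copy_jpeg(const std::uint8_t* data, std::size_t size) {
  if (!looks_like_jpeg(data, size)) return {};
  return std::vector<std::uint8_t>(data, data + size);
}
```

The returned vector can be kept in any container for the lifetime of the application and passed to cv::imdecode later when a decoded image is needed.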