RTSP source with Jetson TX2

I have a trained model to perform object detection on images (using TensorRT) that I would like to test on video streams, coming from one, or possibly more, RTSP sources using Jetson TX2.

I’ve looked at:

But I’m having a hard time understanding how to capture frames from the RTSP stream and feed them to my model, since I’m not really familiar with GStreamer or with video in general.

Could someone please refer me to some good resources/examples?
What’s the most efficient way to achieve this?

Thank you

A basic GStreamer pipeline for decoding an RTSP stream would look like this:

gst-launch-1.0 rtspsrc location=rtsp:// ! application/x-rtp, media=video ! decodebin ! video/x-raw, format=NV12 ! videoconvert ! xvimagesink

If this works, you may try to change the pipeline string here to something like

ss << "rtspsrc location=rtsp:// ! application/x-rtp, media=video ! ";
ss << "decodebin ! video/x-raw, format=NV12 ! videoconvert ! ";
ss << "video/x-raw, width=(int)" << mWidth << ", height=(int)" << mHeight << ", "; 
ss << "format=RGB ! videoconvert ! video/x-raw, format=RGB ! videoconvert ! ";
ss << "appsink name=mysink";

Of course, you would change the RTSP source address if your NN is not trained for detecting flowers, birds and rabbits ;-)
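
For anyone who wants the string-building step above in a form that is easy to reuse for several cameras, here is a minimal sketch using only the standard library. The function name `buildRtspPipeline` and its parameters are my own (hypothetical), not part of any NVIDIA sample. The repeated videoconvert / RGB caps stages from the string above are collapsed into a single conversion, which I assume is equivalent. As in the original string, nothing here rescales the video, so the width and height are assumed to match the stream.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Builds a GStreamer pipeline description for one RTSP source.
// `rtspUrl`, `width`, `height` and `sinkName` are illustrative parameters
// (hypothetical names); the element chain mirrors the string quoted above.
std::string buildRtspPipeline(const std::string& rtspUrl, int width, int height,
                              const std::string& sinkName = "mysink") {
    assert(width > 0 && height > 0);  // caps need positive dimensions
    std::ostringstream ss;
    ss << "rtspsrc location=" << rtspUrl << " ! application/x-rtp, media=video ! ";
    ss << "decodebin ! video/x-raw, format=NV12 ! videoconvert ! ";
    ss << "video/x-raw, width=(int)" << width << ", height=(int)" << height << ", ";
    ss << "format=RGB ! ";
    ss << "appsink name=" << sinkName;
    return ss.str();
}
```

Calling it once per camera URL gives one pipeline description per RTSP source; the result is meant to be handed to whatever code already parses the pipeline string in your application.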

Thanks for your reply, is the conversion to RGB via videoconvert done in GPU or CPU?

I think this pipeline uses only the CPU.

If the RTP stream is always H.264 or H.265 encoded, you can use the omxh264dec or omxh265dec plugin, which uses the hardware decoder.
For example, if you receive H.264, you can first check a pipeline like this in a terminal, displaying the video in a GUI window:

gst-launch-1.0 rtspsrc location=rtsp:// ! rtph264depay ! h264parse ! omxh264dec ! videoconvert ! xvimagesink

If it works, you would just replace the decodebin part of the pipeline string with

rtph264depay ! h264parse ! omxh264dec


[EDIT: Sorry, I spoke too fast. decodebin does indeed use NVDEC.]
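
If you still want to pin the H.264 hardware path explicitly rather than rely on what decodebin picks, the replacement described above is a plain string substitution. A minimal sketch, where `useOmxH264Decoder` is a hypothetical helper name of mine and the stream is assumed to be H.264:

```cpp
#include <string>

// Returns `pipeline` with the first "decodebin" replaced by the explicit
// H.264 chain quoted above. If "decodebin" does not occur, the string is
// returned unchanged. (Helper name is hypothetical.)
std::string useOmxH264Decoder(std::string pipeline) {
    const std::string generic = "decodebin";
    const std::string h264Chain = "rtph264depay ! h264parse ! omxh264dec";
    const std::string::size_type pos = pipeline.find(generic);
    if (pos != std::string::npos) {
        pipeline.replace(pos, generic.size(), h264Chain);
    }
    return pipeline;
}
```

For an H.265 source the same idea would apply with the H.265 counterparts of these elements; I have only written out the H.264 case because that is the one shown in the pipeline above.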

Hi nik0lai,
We support both tegra_multimedia_api and GStreamer. Which one are you using to run your TensorRT model?

I’ve been using gstreamer so far and right now it’s working okay.
My problem is that I’m not able to scale efficiently with the number of input RTSP streams.
How can I improve my performance?

Hi nik0lai,
Have you checked https://developer.nvidia.com/deepstream-jetson ? It is implemented in gstreamer.

Thanks for your answer.
Yes, I have, but my model has custom layers that I believe are not supported by DeepStream right now. Am I correct? Do you know when they will be supported?


Hi nik0lai,
Please check

Thanks for the pointer. Unfortunately, I’m not able to run the DeepStream example; I get the following error:

nvgstiva-app -c config/nvgstiva-app_yolo_config.txt
------------>  -----------------
CPU usage measurement requires governor to be set to performance
Using winsys: x11 
------------> 0,0.0,1.0,0.0:1,0.0,1.0,1.0:2,0.0,0.0,1.0:3,1.0,1.0,0.0 -----------------
File does not exist : sources/lib/models/yolov3-kFLOAT-batch1.engine
Unable to find cached TensorRT engine for network : yolov3 precision : kFLOAT and batch size :1
Creating a new TensorRT Engine
Loading pre-trained weights...
Loading complete!
Total Number of weights read : 62001757
Unsupported layer type --> ""button" aria-label="Switch branches or tags" aria-expanded="false" aria-haspopup="true">"
nvgstiva-app: yolo.cpp:390: void Yolo::createYOLOEngine(int, std::__cxx11::string, std::__cxx11::string, std::__cxx11::string, nvinfer1::DataType, Int8EntropyCalibrator*): Assertion `0' failed.
Aborted (core dumped)

All previous steps in the README were successful.

Am I doing something wrong?

Fixed. It was a wrong cfg file. Thanks.
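
For anyone who hits the same assertion: the "Unsupported layer type" line in the log above contains HTML attributes, which suggests (my reading, not confirmed in the thread) that the cfg file was a saved web page rather than a plain Darknet-style cfg. A rough sanity check before building the engine could look like the sketch below. `looksLikeDarknetCfg` is a hypothetical helper, and the heuristic only assumes that a valid cfg begins, after blank lines and `#` comments, with a bracketed section header such as `[net]`.

```cpp
#include <sstream>
#include <string>

// Heuristic check on the text of a cfg file: the first line that is neither
// blank nor a '#' comment must be a "[section]" header. An HTML page fails
// this because its first meaningful line starts with '<'.
bool looksLikeDarknetCfg(const std::string& text) {
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        // Skip surrounding whitespace, including '\r' from CRLF files.
        const std::string::size_type first = line.find_first_not_of(" \t\r");
        if (first == std::string::npos) continue;  // blank line
        if (line[first] == '#') continue;          // comment line
        const std::string::size_type last = line.find_last_not_of(" \t\r");
        return line[first] == '[' && line[last] == ']';
    }
    return false;  // empty file or comments only
}
```

This does not validate the layers themselves; it only catches the case where the file is not a cfg at all.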