Nano not using GPU with gstreamer/python. Slow FPS, dropped frames

I am using a typical pipeline (see below) to feed frames to my OpenCV/Python program. My CPU runs at 100% while my GPU sits at only 25%. Is there anything I can do to shift more work onto the GPU?

def gstreamer_pipeline(
    capture_width=3280, capture_height=2464,
    display_width=1280, display_height=720,
    framerate=21, flip_method=0,
):
    return (
        "nvarguscamerasrc ! "
        "video/x-raw(memory:NVMM), "
        "width=(int)%d, height=(int)%d, "
        "format=(string)NV12, framerate=(fraction)%d/1 ! "
        "nvvidconv flip-method=%d ! "
        "video/x-raw, width=(int)%d, height=(int)%d, format=(string)BGRx ! "
        "videoconvert ! "
        "video/x-raw, format=(string)BGR ! appsink"
        % (capture_width, capture_height, framerate,
           flip_method, display_width, display_height)
    )

The optimal-performance solution is to pass NVMM buffers [video/x-raw(memory:NVMM)] through the entire pipeline. However, OpenCV in Python only accepts CPU buffers [video/x-raw, format=BGR], so each NVMM buffer must be copied to a CPU buffer, which adds significant CPU load.
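One mitigation worth trying (an assumption on my part, not something stated above): since the NVMM-to-CPU copy is unavoidable with appsink, let nvvidconv downscale the frame on the GPU while it is still in NVMM memory, so the copy and the CPU-side videoconvert touch far fewer pixels. A minimal sketch of such a pipeline string:

```python
# Sketch (assumption): capture at full sensor resolution, but have nvvidconv
# downscale in NVMM before the frame crosses to a CPU buffer, reducing the
# work done by the copy and by the CPU-side videoconvert element.
def downscaled_pipeline(capture_w=3280, capture_h=2464,
                        out_w=1280, out_h=720,
                        framerate=21, flip_method=0):
    return (
        "nvarguscamerasrc ! "
        "video/x-raw(memory:NVMM), width=(int)%d, height=(int)%d, "
        "format=(string)NV12, framerate=(fraction)%d/1 ! "
        "nvvidconv flip-method=%d ! "
        # the downscale happens here, on the GPU, before leaving NVMM
        "video/x-raw, width=(int)%d, height=(int)%d, format=(string)BGRx ! "
        "videoconvert ! "
        "video/x-raw, format=(string)BGR ! appsink"
        % (capture_w, capture_h, framerate, flip_method, out_w, out_h)
    )

print(downscaled_pipeline())
```

You would then open this with cv2.VideoCapture(downscaled_pipeline(), cv2.CAP_GSTREAMER) as in the original snippet; the trade-off is that OpenCV only ever sees the downscaled frames.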

On Jetson Nano with 'sudo jetson_clocks' executed, this approach may be good for 1920x1080p30. But the resolution of your use case is close to 4K, which may exceed the hardware's capability.

I suggest you try pure gstreamer or tegra_multimedia_api.

That makes sense.

Is there any way for me to do OpenCV processing on the device with the CSI camera? I am building a live-streaming system, and I need to do a little bit of panning/zooming and add text overlays to the stream.
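For what it's worth, the pan/zoom part can be done on CPU frames with plain array operations. A minimal sketch (assuming frames arrive as H x W x 3 numpy arrays from cv2.VideoCapture; the pan_zoom helper and its parameters are hypothetical, for illustration only):

```python
import numpy as np

# Digital pan/zoom sketch: crop a window centred on (cx, cy), then
# nearest-neighbour resize the crop back to the output size.
def pan_zoom(frame, cx, cy, zoom, out_w, out_h):
    h, w = frame.shape[:2]
    cw, ch = max(1, int(out_w / zoom)), max(1, int(out_h / zoom))
    x0 = min(max(cx - cw // 2, 0), w - cw)   # clamp crop window to the frame
    y0 = min(max(cy - ch // 2, 0), h - ch)
    crop = frame[y0:y0 + ch, x0:x0 + cw]
    ys = np.linspace(0, ch - 1, out_h).astype(int)  # nearest-neighbour rows
    xs = np.linspace(0, cw - 1, out_w).astype(int)  # nearest-neighbour cols
    return crop[ys][:, xs]

frame = np.zeros((1080, 1920, 3), dtype=np.uint8)  # stand-in for a camera frame
view = pan_zoom(frame, cx=960, cy=540, zoom=2.0, out_w=1280, out_h=720)
print(view.shape)  # (720, 1280, 3)
```

Text overlays could then be drawn on the result with cv2.putText before encoding. Note this all runs on the CPU, so it does not remove the buffer-copy bottleneck discussed above.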

There is a sample of gstreamer + cuda::gpuMat. Please refer to

Can I use gpuMat with C++ only, or can I run gpuMat with Python?

I am not sure, but it looks like Python uses appsink to get CPU [video/x-raw] buffers, so using gpuMat may not be supported there.

Ok. Thanks for the help. I’m going to use pure gstreamer as you suggested.

Attached a sample for reference. Please follow the steps:

1. Install CUDA and the tegra_multimedia_api samples through sdkmanager
2. Download the script
3. Add the option and execute the script to build OpenCV 4.1.1
4. Execute
$ sudo ldconfig -v
5. Build and run the sample
~/gst_cv_gpumat$ CUDA_VER=10.0 make
Compiling: gst_cv_gpumat.cpp
g++ -I/usr/src/tegra_multimedia_api/include -I/usr/local/cuda-10.0/include `pkg-config --cflags gstreamer-1.0 opencv4` -c gst_cv_gpumat.cpp -o gst_cv_gpumat.o
Linking: gst_cv_gpumat
g++ -o gst_cv_gpumat gst_cv_gpumat.o -I/usr/src/tegra_multimedia_api/include -I/usr/local/cuda-10.0/include `pkg-config --cflags gstreamer-1.0 opencv4` -Wall -std=c++11 -L/usr/lib/aarch64-linux-gnu/tegra/ -lEGL -lGLESv2 -L/usr/lib/aarch64-linux-gnu/tegra/ -lcuda -lnvbuf_utils -L/usr/local/cuda-10.0/lib64/ -lcudart `pkg-config --libs gstreamer-1.0 opencv4`
~/gst_cv_gpumat$ ./gst_cv_gpumat


Hi @DaneLLL
Does this link use pure gstreamer for decoding? Does it avoid the gstreamer+opencv bottleneck of copying NVMM buffers to CPU buffers? I see that lines 123 and 125 of the code use numpy and the OpenCV library; aren't those lines a bottleneck just like gstreamer+opencv?

Also, does tegra_multimedia_api have a Python API?

Hi @LoveNvidia
Your questions look different from this topic. For clarity, please make a new post.

The SH link is not working. Can you provide new instructions for JetPack 4.4?
I am having similar issues; GStreamer is using only the CPU.
OpenCV 4.4.0

Please check this link:

JetPack 4.4 ships CUDA 10.2, so please set CUDA_VER=10.2