I am an ophthalmologist, and I wrote Python code for investigating eye movements.
On my desktop:
The code (Python, OpenCV 4.3 built with CUDA and cuDNN, dlib 19, and tiny YOLOv3)
runs on my computer (i5-6600 3.5 GHz CPU, 32 GB RAM, GTX 1080 Ti)
with a USB 4K camera, at these average speeds:
At 3840×2160 resolution:
- when only dlib runs: 0.085 s (about 12 FPS)
- when tiny YOLOv3 is added for both eyes separately: 0.22 s (about 4.5 FPS)
At 1920×1080 resolution:
- when only dlib runs: 0.036 s (about 28 FPS)
- when everything runs (tiny YOLOv3 for both eyes separately): 0.17 s (about 5.9 FPS)
It seems that 5 FPS, or even 4, is enough for my work.
On the Jetson Xavier:
Now I run the same code with the same camera on my Jetson Xavier (MAXN mode). OpenCV 4.3 is built from source with CUDA and cuDNN, dlib is built with DLIB_USE_CUDA enabled, and I use one of these GStreamer pipelines:
gst_url = "v4l2src device=/dev/video0 do-timestamp=false ! image/jpeg, width=3840, height=2160, framerate=30/1 ! jpegdec ! videoconvert ! appsink drop=true max-lateness=1 enable-last-sample=true max-buffers=1"
or
gst_url = "v4l2src device=/dev/video0 ! video/x-raw, framerate=30/1 ! videoscale ! videoconvert ! appsink"
or
gst_url = "v4l2src device=/dev/video0 ! jpegdec ! videoconvert ! appsink"
None of them makes a significant difference:
At 3840×2160 resolution:
- when only dlib runs: 0.068 s (about 14.7 FPS)
- when tiny YOLOv3 is added for both eyes separately: 0.32 s (about 3.1 FPS)
At 1920×1080 resolution:
- when only dlib runs: 0.029 s (about 34 FPS)
- when everything runs (tiny YOLOv3 for both eyes separately): 0.27 s (about 3.7 FPS)
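For context, this is roughly how a pipeline string like the ones above is handed to OpenCV. It is a sketch rather than my exact code, it assumes an OpenCV build with GStreamer support (WITH_GSTREAMER=ON), and the helper name make_mjpeg_pipeline is mine, just for illustration:

```python
# Sketch: building one of the MJPEG pipeline strings above and opening it
# with OpenCV's GStreamer backend. make_mjpeg_pipeline is a hypothetical
# helper, not part of OpenCV.
def make_mjpeg_pipeline(device="/dev/video0", width=3840, height=2160, fps=30):
    return (
        f"v4l2src device={device} do-timestamp=false "
        f"! image/jpeg, width={width}, height={height}, framerate={fps}/1 "
        "! jpegdec ! videoconvert "
        "! appsink drop=true max-lateness=1 enable-last-sample=true max-buffers=1"
    )

# Usage (needs a camera attached and OpenCV built with GStreamer support):
#   import cv2
#   cap = cv2.VideoCapture(make_mjpeg_pipeline(), cv2.CAP_GSTREAMER)
#   ok, frame = cap.read()
```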
In addition, I have a serious camera lag problem: the measured FPS looks fine, but the displayed image lags behind the camera. Setting appsink drop=true max-lateness=1 enable-last-sample=true max-buffers=1 has no significant effect, cap.set(cv2.CAP_PROP_FPS, 5) has minimal effect, and threading the OpenCV capture made no significant difference either.
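For completeness, the "threading in OpenCV capture" I tried looks roughly like this: a background thread keeps reading from the capture and overwrites a single buffered frame, so the consumer always sees the newest frame instead of a stale one queued by the driver. This is a generic sketch (the class name and details are mine), not a guaranteed fix:

```python
import threading

class LatestFrameGrabber:
    """Reads continuously from a capture source on a background thread,
    keeping only the most recent frame. `source` is anything with a
    read() -> (ok, frame) method, e.g. a cv2.VideoCapture."""

    def __init__(self, source):
        self.source = source
        self._lock = threading.Lock()
        self._frame = None
        self._running = False
        self._thread = None

    def start(self):
        self._running = True
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()
        return self

    def _loop(self):
        while self._running:
            ok, frame = self.source.read()
            if not ok:
                break
            with self._lock:
                self._frame = frame  # overwrite: older frames are dropped

    def read(self):
        """Return the newest frame seen so far (None before first capture)."""
        with self._lock:
            return self._frame

    def stop(self):
        self._running = False
        if self._thread is not None:
            self._thread.join(timeout=1.0)
```

In my tests this kind of wrapper did not remove the lag, which is why I suspect the buffering happens below OpenCV, in the driver or GStreamer queue.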
Options:
1 - Python and OpenCV are amateur tools; Nvidia is a more professional company. As a doctor, you should go do your own job and leave this to professional hands: have the code rewritten from scratch in C++ with GStreamer, TensorRT, and more fancy Nvidia stuff.
2 - Change your camera. A CSI camera would work with nvarguscamerasrc (instead of v4l2src), and you could easily use NVMM buffers so the GPU does the work and everything runs faster, without the buffer lag. Of course, you need a CSI carrier board; luckily there are cheap carrier boards for a single CSI camera.
3 - Optimizing the GStreamer pipeline is enough. Even starting from v4l2src, you can route the buffers through the GPU and run faster on Jetson-family embedded boards.
4 - Change the OpenCV settings and code, or rebuild OpenCV with additional flags (on top of CUDA, cuDNN, V4L2, etc., add OpenGL support or something else).
5 - Work on a faster YOLO; that is the bottleneck. Learn the Nvidia stack, such as the DeepStream SDK and TensorRT. Besides, this is ultimately an Nvidia forum, so maybe you may not ask about other stuff, such as GStreamer or OpenCV, as there are legal issues?
6 - You are forgetting the main thing... which is...
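If option 3 were the road, a pipeline that keeps the MJPEG decode on the Jetson's hardware blocks might look like the sketch below. The element names (nvv4l2decoder with its mjpeg property, nvvidconv) come from NVIDIA's accelerated GStreamer stack but vary with JetPack version, so this is an untested starting point, not a working configuration:

```python
# Hypothetical Jetson pipeline for option 3: decode MJPEG on the hardware
# decoder and do the colour conversion via nvvidconv in NVMM memory, only
# converting to system-memory BGR right before appsink. Untested sketch;
# element names depend on the JetPack release.
gst_url = (
    "v4l2src device=/dev/video0 "
    "! image/jpeg, width=3840, height=2160, framerate=30/1 "
    "! nvv4l2decoder mjpeg=1 "
    "! nvvidconv ! video/x-raw, format=BGRx "
    "! videoconvert ! video/x-raw, format=BGR "
    "! appsink drop=true max-buffers=1"
)
```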
Which road should I take? I would appreciate your help. Best regards.