Object detection on multiple streams

Hi,

I’m following the hello ai world tutorials.

I want to perform object detection on multiple video streams.

Initially I would like to do this with webcams (for testing), but my final goal is to be able to do it with RTSP streams.

How can i achieve this?
What should be the approach?

Thank you very much.

Hi @oraday, detectNet from jetson-inference is stateless (except for the tracker), so you can use one detectNet across multiple video streams. You can create multiple videoSource objects, one for each camera, then process each of them with detectNet. Some pseudocode based on detectNet.py:

from jetson_inference import detectNet
from jetson_utils import videoSource

cameras = [
    videoSource("/dev/video0", options={'width': 1280, 'height': 720}),
    videoSource("/dev/video1", options={'width': 1280, 'height': 720}),  # second camera device
]

net = detectNet("ssd-mobilenet-v2")

while True:
    for i in range(len(cameras)):
        img = cameras[i].Capture(timeout=0)  # use timeout=0 so it returns immediately if no image ready from that camera

        if img is None:    # continue to the next camera if no image ready
            continue

        detections = net.Detect(img)

        print(f"Camera {i} detections: {detections}")

That said, DeepStream will give you optimal multi-stream performance, batching, better tracking, etc.

Thank you, @dusty_nv.

Are you suggesting that it’s worth investing time in learning the DeepStream SDK if I aim to develop real production applications?

Does working with DeepStream ultimately lead to the creation of more efficient and faster applications compared to working with ‘jetson-inference’?

I’ve started going through the DeepStream SDK information on this site, but I haven’t come across any basic tutorials for beginners. I’m struggling to understand how I should approach this. What are your suggestions?

Additionally, I’ve attempted to develop code directly on my Jetson Xavier NX, but it’s rather slow when searching for information on the internet. Should I do my development on a more powerful computer with DeepStream or jetson-inference and then transfer it to my Xavier NX? What would be the best practical approach for development?

I know these are a bunch of questions, but I would greatly appreciate your insights.

Thanks!

I think it probably depends on how many streams and models you want to run, and how optimized you want the application to be. DeepStream can be a lot faster, is more closely integrated with TAO Toolkit for INT8 models (although jetson-inference does support TAO detection models), and DeepStream supports DLA.

For more complex multi-stream applications, certainly yes. It also has much better tracking (if you are using tracking). jetson-inference predates DeepStream and started as a tutorial to make it easy to learn about DNNs (hence its aim is to provide the easiest-possible C++/Python APIs for doing one stream in realtime on Jetson Nano).

I’d suggest the DeepStream Python examples, and looking at the notebooks in this DLI course:

If you only want to do a couple of streams, you can always try my approach above with jetson-inference as it might be quick for you to code. However ultimately if you want it to scale bigger, you might want to invest time into learning DeepStream.

Personally, I always SSH into my Jetson boards. I use MobaXTerm (SSH client) on Windows, which has a nice remote file browser. Also, I know people use Visual Studio Code remotely over SSH. So you are still developing/compiling “on” the Jetson, it’s just being done “from” your PC.

The exception is when you are getting started with camera stuff; it can be nice to still have a display attached to your Jetson so you can easily view the video streams (until you get into more remote streaming like RTP/RTSP/WebRTC).

Hope that helps!

Thank you very much dusty.

What is your opinion on the Graph Composer compared to direct Python coding with DeepStream?

Would it be easier for beginners to start with the Graph Composer to play with the basic stuff?

Will I eventually find myself using DeepStream with Python (and not with the Graph Composer)?

I haven’t personally tried the Graph Composer but it seems like an awesome tool, and if you were starting new probably a good idea to check it out!

Hi again Dusty,

I tried the following code.

I’m getting detection-box overlays on both display windows, regardless of which stream the detections came from.

i.e. I want detections from the first videoSource displayed only on the corresponding videoOutput (and not on both outputs).

I tried a few things that didn’t work, so I decided to ask you how to approach this.

Thanks!

from jetson_inference import detectNet
from jetson_utils import videoSource, videoOutput
import time

start = time.time()
fpsFilt = 0

net = detectNet("ssd-mobilenet-v2", threshold=0.35)

net.SetTrackingEnabled(True)
net.SetTrackingParams(minFrames=3, dropFrames=15, overlapThreshold=0.5)


display = [
    videoOutput("display://0"),
    videoOutput("display://0"),
]

cameras = [
    videoSource("/dev/video0"),
    videoSource("/dev/video2"),
]

isStreaming = True

while isStreaming:

    for i in range(len(cameras)):

        if not display[i].IsStreaming():
            isStreaming = False

        img = cameras[i].Capture(timeout=0)  # use timeout=0 so it returns immediately if no image ready from that camera
        
        if img is None:    # continue to the next camera if no image ready
            continue

        detections = net.Detect(img, overlay='box,labels,track,conf')

        for detection in detections:
            class_name = net.GetClassDesc(detection.ClassID)
            print(f"Detected '{class_name}'")
            if detection.TrackStatus >= 0:  # actively tracking
                print(f"object {detection.TrackID} at ({detection.Left}, {detection.Top}) has been tracked for {detection.TrackFrames} frames")
            else:  # if tracking was lost, this object will be dropped the next frame
                print(f"object {detection.TrackID} has lost tracking")  

        print(detections)

        display[i].Render(img)
    
    end = time.time()
    dt = end - start
    fps = 1 / dt
    fpsFilt = .9*fpsFilt + .1*fps

    print("Global loop rate: {:.1f} FPS".format(fpsFilt))

    start = time.time()

Hi Dusty,

Any comment on my previous post?

@oraday if you are using tracking, use two separate detectNet objects (one for each camera)
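
The reason, sketched with a hypothetical stand-in Tracker class (no Jetson libraries needed): the tracker inside detectNet keeps per-object state across frames, so interleaving two streams through one instance mixes their histories.

```python
# Why one tracker per stream: a tracker accumulates per-object state
# across frames. 'Tracker' below is a stand-in that just counts how many
# frames each object id has been seen.

class Tracker:
    def __init__(self):
        self.frames_seen = {}

    def update(self, object_ids):
        for oid in object_ids:
            self.frames_seen[oid] = self.frames_seen.get(oid, 0) + 1
        return self.frames_seen

# One shared tracker across two streams: histories get mixed
shared = Tracker()
shared.update(["car-1"])  # frame from camera 0
shared.update(["car-1"])  # frame from camera 1: a different car, same id!
# shared now believes a single "car-1" was tracked for 2 frames

# One tracker per stream: independent histories, no cross-talk
per_cam = [Tracker(), Tracker()]
per_cam[0].update(["car-1"])
per_cam[1].update(["car-1"])
```

The same reasoning applies to the real tracker state inside detectNet, which is why one instance per camera is needed when tracking is enabled.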

I tried the attached code.

It works most of the time, but once in a while I get an error:

[gstreamer] gstCamera::Capture() -- an error occurred retrieving the next image buffer
Traceback (most recent call last):
  File "/home/nx/Desktop/pyPro/detection-tracking-multiple-streams.py", line 38, in <module>
    img = cameras[i].Capture()
Exception: jetson.utils -- videoSource failed to capture image

Any idea why, and what can be done to avoid or override this?

Thanks!

from jetson_inference import detectNet
from jetson_utils import videoSource, videoOutput
import time

start = time.time()
fpsFilt = 0

net = [detectNet("ssd-mobilenet-v2", threshold=0.35),
       detectNet("ssd-mobilenet-v2", threshold=0.35)] 

net[0].SetTrackingEnabled(True)
net[0].SetTrackingParams(minFrames=3, dropFrames=15, overlapThreshold=0.5)

net[1].SetTrackingEnabled(True)
net[1].SetTrackingParams(minFrames=3, dropFrames=15, overlapThreshold=0.5)


display = [
    videoOutput("display://0"),
    videoOutput("display://0"),
]

cameras = [
    videoSource("/dev/video0"),
    videoSource("/dev/video2"),
]

isStreaming = True

while isStreaming:

    for i in range(len(cameras)):

        if not display[i].IsStreaming():
            isStreaming = False

        # img = cameras[i].Capture(timeout=0)  # use timeout=0 so it returns immediately if no image ready from that camera
        img = cameras[i].Capture()
        
        if img is None:    # continue to the next camera if no image ready
            continue

        detections = net[i].Detect(img, overlay='box,labels,track,conf')

        for detection in detections:
            class_name = net[i].GetClassDesc(detection.ClassID)
            print(f"Detected '{class_name}'")
            if detection.TrackStatus >= 0:  # actively tracking
                print(f"object {detection.TrackID} at ({detection.Left}, {detection.Top}) has been tracked for {detection.TrackFrames} frames")
            else:  # if tracking was lost, this object will be dropped the next frame
                print(f"object {detection.TrackID} has lost tracking")  

        print(detections)

        display[i].Render(img)
    
    end = time.time()
    dt = end - start
    fps = 1 / dt
    fpsFilt = .9*fpsFilt + .1*fps

    print("Global loop rate: {:.1f} FPS".format(fpsFilt))

    start = time.time()

Any idea what might be the problem, and how to bypass it so that it won’t break the code?

@oraday not sure - is there other debug output from the log? Could be a camera issue or connection dropping a frame. Does it continue capturing frames after that? If so, I would just let it continue running.

There is no other log information.
And it does not continue capturing frames; the exception stops the program.
I would be happy if it could keep running after a dropped frame.
Is there any way to achieve this?
Thanks again!

Hi @dusty_nv,
I just hit the same problem. It runs well for a while (maybe 2 minutes), but then shuts down with the error “[gstreamer] gstCamera::Capture() – an error occurred retrieving the next image buffer” followed by a traceback. I tried moving “/dev/video0” out of the .py file and passing it on the command line instead, like “python3 xxx.py /dev/video0”. Amazingly, it ran about 4 minutes longer before the error occurred. I can’t figure out the reason, and I wonder if there is a way to automatically restart the program when the error appears so that it keeps running.
Thanks

You can catch the exception from videoSource.Capture() that gets thrown when an error occurs. If it’s a normal timeout, None will be returned from Capture()
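
A minimal sketch of that pattern, using a hypothetical flaky_capture function as a stand-in for cameras[i].Capture(timeout=0) so it runs without Jetson hardware: treat a capture exception the same way as a timeout, i.e. skip the frame and keep looping.

```python
# Sketch of the keep-running pattern: wrap Capture() in try/except so a
# transient gstreamer error doesn't kill the loop. 'flaky_capture' is a
# stand-in that sometimes raises (dropped frame) and sometimes returns
# None (timeout, no frame ready yet).

def flaky_capture(frame_id):
    if frame_id % 5 == 3:
        raise RuntimeError("videoSource failed to capture image")  # simulated dropout
    if frame_id % 5 == 4:
        return None  # simulated timeout: no frame ready yet
    return f"frame-{frame_id}"

def capture_safely(capture, frame_id):
    try:
        img = capture(frame_id)
    except RuntimeError:
        return None  # treat a dropped frame like a timeout and keep running
    return img

frames = [capture_safely(flaky_capture, i) for i in range(10)]
good = [f for f in frames if f is not None]
```

In the real loop, the try/except would go around cameras[i].Capture(), with continue on both the exception and the None case, so one bad frame never stops the program.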

You might want to try temporarily disabling the DNN stuff to see if that’s related, or if it’s a camera issue. Is it always the same camera dropping connection? Does that camera drop out even when it’s the only camera being captured?

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.