• Hardware Platform (Jetson / GPU): Jetson
• DeepStream Version: 6.0
• JetPack Version (valid for Jetson only): 4.6.1
Hello everyone,
I am working on a robotic system that uses 4 cameras, each offset by 90 degrees, forming a complete 360-degree view around the robot. My pipeline includes a batch-4 nvinfer, followed by a batch-4 nvtracker, then an nvmultistreamtiler, and finally an onscreendisplay for visualization.
My main challenge is that the nvtracker operates on each camera separately, so a detected object is assigned a new tracking ID whenever it moves from one camera's view to another. To try to solve this, I reordered the nvtracker and the tiler, placing the tracker after the tiler. After this change, however, the pipeline stopped working correctly: no detections appear, and no error messages are shown.
My specific question is: How can I configure the pipeline so that the nvtracker operates on a single image constructed from the 4 cameras, maintaining the consistency of the tracking IDs as an object moves through the different camera views? I have already managed the overlaps and deformations of my cameras, and the transition of views between cameras is smooth.
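One way to frame the single-image approach: if the four dewarped views are stitched side by side into one 360-degree panorama before tracking, each per-camera bbox can be mapped into panorama coordinates with a fixed horizontal offset plus wrap-around at the seam. A minimal sketch of that mapping, assuming each camera covers one 90-degree slice of equal width; `to_panorama` and `cam_w` are hypothetical names for illustration, not DeepStream API:

```python
def to_panorama(cam_idx, left, top, width, height, cam_w=1280):
    """Map a bbox from camera `cam_idx` (0-3, each one 90-degree slice)
    into the coordinate space of a stitched panorama 4*cam_w wide.
    The modulo handles the horizontal wrap-around at the 360-degree seam."""
    pano_w = 4 * cam_w
    pano_left = (cam_idx * cam_w + left) % pano_w
    return pano_left, top, width, height

# A box at x=100 in camera 2 lands at 2*1280 + 100 = 2660 in the panorama:
print(to_panorama(2, 100, 50, 30, 60))  # → (2660, 50, 30, 60)
```

With all detections expressed in one shared coordinate space like this, a single-stream tracker would see an object cross camera boundaries as continuous motion rather than as a disappearance and a new appearance.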
I appreciate any guidance or suggestions you can offer, especially if someone has faced and resolved a similar challenge.
DeepStream doesn't support cross-camera tracking. You can check here for cross-camera tracking: Metropolis - Multi-camera Tracking. Can you share the test video with us, so we can analyze the use case?
I know that DeepStream does not support multi-camera tracking. That is why I am currently reassigning global IDs based on the IoU of detections in the overlapping areas. This solution is very unreliable (and, as I mentioned in this post, it is not possible to access the tracker's bbox information at moments when there is no detection, so usable overlap instances are rare). That is why I wanted to apply tracking to a single image composed of the 4 images, after having handled the overlaps and deformations with nvdewarper. It seems the tiler is not doing its job well, or something else is failing. Perhaps I should manually transform the metadata from each source after nvinfer into the composite image's coordinates…
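For completeness, here is what that manual metadata transform would look like: nvinfer's bboxes are in per-source coordinates, so to track on the tiled frame they would need to be translated into the composite's coordinate space. A minimal sketch, assuming a 2x2 tile grid laid out row-major by `source_id`; `to_tile_coords` and the resolutions are hypothetical, not part of the DeepStream API:

```python
def to_tile_coords(source_id, bbox, src_w, src_h, tiler_w, tiler_h,
                   rows=2, cols=2):
    """Translate a bbox (left, top, width, height) from per-source
    coordinates into the coordinate space of a rows x cols tiled frame,
    assuming the tiler places source_id in row-major order."""
    tile_w, tile_h = tiler_w // cols, tiler_h // rows
    row, col = divmod(source_id, cols)
    sx, sy = tile_w / src_w, tile_h / src_h  # per-axis scale into the tile
    left, top, w, h = bbox
    return (col * tile_w + left * sx,
            row * tile_h + top * sy,
            w * sx, h * sy)

# Source 3 (bottom-right tile of a 1920x1080 composite), 1280x720 input:
print(to_tile_coords(3, (100, 50, 200, 100), 1280, 720, 1920, 1080))
# → (1035.0, 577.5, 150.0, 75.0)
```

In a DeepStream app this logic would run in a pad probe between the tiler and the tracker, rewriting each object's `rect_params` in the batch metadata; note, though, that a tiled 2x2 layout keeps hard seams between views, unlike a stitched panorama, so IDs may still break at tile boundaries.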
Is there any alternative method to compose an image from different sources?
The video from each of the separate video sources? I am currently using the live feed from my cameras, but I could record a segment if that would help.