Real-time parking and traffic management

Hello all,

During the Onboarding Q&A session it was recommended that we use the forum to explain what we want to develop, so that we can get feedback from developers working on similar projects. So, I will use this opportunity to describe our project.

We need a proof of concept in which people and vehicles are detected in video streams from existing cameras. Detection must cover both stationary and moving objects. From each object's location in the video stream, we will calculate its geolocation and display the detected object as an icon on a map. In short, the proof of concept is the groundwork for city AI parking and traffic management.

Our company's strong point is location retrieval from any type of video stream and video camera. For this we use our own lens and camera distortion model. Here is the video of a plugin that showcases our technology, implemented in the Milestone XProtect video surveillance system. The recording is from one of our online meetings. There are a couple of features to notice here that are unique to our technology. The first is the use of a JPEG image as a map to control the PTZ camera direction. The second is the use of a live video stream to control the PTZ camera direction: moving the mouse over the live video stream points the PTZ camera at that location. Two live streams are used in this example to control one PTZ camera. And the last one is the AI integration: when vehicles are detected in the video streams, their geolocations are calculated and they are displayed on the map as icons. When a person is detected (in this case, me), an icon is also displayed on the map and the PTZ camera is activated for automatic tracking.

My idea is to use NVIDIA Metropolis services on Jetson, with several video streams as inputs and Intelligent Video Analytics used for people and vehicle detection. From the detected pixel locations we will calculate object geolocations and stream them as metadata to the end user (a cloud or in-house server app). Based on this real-time data, the server or cloud app will do its thing (tracking, heat maps, etc.).
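To make this concrete, here is a minimal sketch of the kind of per-detection metadata record we have in mind; the field names and the values are illustrative assumptions on my part, not a finalized schema:

```python
import json
import time

def detection_to_message(track_id, obj_class, lat, lon, width_cm, height_cm):
    """Pack one detection into the metadata record streamed to the server/cloud app.
    Field names are illustrative only, not a finalized schema."""
    return json.dumps({
        "timestamp": time.time(),   # capture time of the frame
        "track_id": track_id,       # per-object tracking id from the detector
        "class": obj_class,         # "person" or "vehicle"
        "geo": {"lat": lat, "lon": lon},
        "size_cm": {"width": width_cm, "height": height_cm},
    })

# Example: one detected vehicle, ready to publish to the server app
msg = detection_to_message(17, "vehicle", 45.815, 15.982, 180.0, 150.0)
print(msg)
```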

Feel free to contact me with any questions, and if you are interested, www.3visiond.com has some more info.

Best regards,

Hrvoje Bilić

Thanks for sharing. Do you want multi-camera tracking on Jetson or on a dGPU server? We currently have MTMC for dGPU: NVIDIA Multi-Camera Tracking AI Workflow

Hello Kesong,
we want to use Jetson for detection only, just like in the multi-camera tracking example here (Tracking Objects Across Multiple Cameras Made Easy with Metropolis Microservices | NVIDIA On-Demand). Our system architecture is almost the same as the one in the video, but we do one thing differently.
When mapping pixels to the physical world, we actually calculate the location of the detected object in centimetres with respect to the camera mounting location. This also enables us to calculate the detected object's width and height in centimetres.
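As a rough illustration of why the camera-relative distance matters for sizing: under an ideal pinhole model, the metric extent follows directly from the pixel extent and the distance. Our real model also handles lens distortion; this is just the textbook relation:

```python
def metric_size_cm(pixel_extent, distance_cm, focal_length_px):
    """Ideal pinhole relation: metric extent = pixel extent * distance / focal length.
    Assumes the object is roughly fronto-parallel to the camera; illustrative only."""
    return pixel_extent * distance_cm / focal_length_px

# Example: a 220 px tall bounding box, 800 cm from the camera, f = 1000 px
print(metric_size_cm(220, 800.0, 1000.0))  # 176.0 cm, a plausible person height
```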
Furthermore, outdoor applications require 3D space to be embedded in the video stream. Our background is in the entertainment industry, where we used ultra-wide-angle cameras to track stage performers with moving heads. Most of the time there are stairs on the stage, and those stairs are visible in the video stream while we track. Once the performer starts to walk up the stairs, the tracking height must be adjusted automatically. This means the vision system must be able to combine 3D data with the live video stream. We have managed to invent a new camera model that enables us to overlay 3D data on the video stream.
Why am I talking about 3D? It seems to me that embedding 3D data for each detected object is the right way to go. For example, the ONVIF standard has supported geolocation for a number of years now, and only recently have new cameras arrived on the market with the ability to send metadata with geolocation tags. One can assume that, in the future, video cameras or some other device/service will export location data for each detected object, either in centimetres or as latitude and longitude. So why not embed that data right now? If the data is there, it will be used; if it is not there, it will not be used.
Best regards,
Hrvoje

Thanks for sharing. It seems your use case needs 3D object detection. I am wondering how the camera gets the geolocation. Maybe there is a model to predict it within the camera?

No, we are calculating the geolocation from pixels. In this video surveillance example, an old, discontinued Axis 3707-PE model with no AI is used, so the calculation is done on the PC side.
I could write a book on this topic, but in short, I have invented a system for transforming a real-world video camera into a pinhole camera model. Any camera, whether panoramic, fitted with ultra-wide-angle lenses, or even thermal, can be transformed into a pinhole camera model. After that, it is all trigonometry and geometry. No chessboard, and no intrinsic or extrinsic camera parameters are used, and we don't even "undistort" the image. We undistort only the point of interest; for example, the bottom-middle point of a detected person's bounding box is the location of the person's feet, so we undistort just that point.
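Our camera model itself is proprietary, but the "trigonometry and geometry" that follows it is standard. A minimal sketch, assuming the feet point has already been undistorted into an ideal pinhole model with known focal length (in pixels), mounting height, and downward tilt:

```python
import math

def feet_point_to_ground_cm(u, v, cx, cy, f_px, cam_height_cm, tilt_rad):
    """Intersect the viewing ray of an (already undistorted) feet pixel with the
    ground plane. Ideal pinhole model, camera tilted down by tilt_rad; returns
    (forward, lateral) offsets in cm relative to the point below the camera."""
    # Ray direction in camera coordinates (z forward, y down, x right)
    x = (u - cx) / f_px
    y = (v - cy) / f_px
    # Rotate the ray by the camera tilt (pitch) into world coordinates
    cos_t, sin_t = math.cos(tilt_rad), math.sin(tilt_rad)
    wy = y * cos_t + sin_t    # downward component of the ray
    wz = -y * sin_t + cos_t   # horizontal forward component of the ray
    if wy <= 0:
        raise ValueError("Ray does not hit the ground (point above horizon)")
    s = cam_height_cm / wy    # scale so the ray descends exactly cam_height_cm
    return s * wz, s * x      # (forward, lateral) in cm

# Example: feet at pixel (900, 700) in a 1280x720 image, f = 800 px,
# camera mounted 400 cm high, tilted 25 degrees down
print(feet_point_to_ground_cm(900, 700, 640, 360, 800, 400, math.radians(25)))
```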
The video posted here gives an example of how we calculate geo coordinates. At 3:13 I am tracking a person's feet with the mouse pointer over the live video feed from the Axis 3707. Notice that the image is distorted (the line over the garage doors shows the level of distortion). The mouse pointer location in pixels is used to calculate the real-world 3D coordinates of the lady's feet, relative to the camera mounting position. Since the video camera's geolocation is known, the geolocation of the person is easily calculated. So, first we calculate the 3D location of the person in centimetres from its location in the video stream, then we calculate its geolocation, and then, since we know the geolocation of the PTZ camera, we calculate the amount of pan, tilt and zoom needed to point the PTZ camera at the person's location.
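Continuing the sketch above, the chain from camera-relative centimetres to geolocation to PTZ pan/tilt could look like this (flat-earth approximation around the camera, hypothetical coordinates, zoom omitted):

```python
import math

EARTH_R_CM = 6_371_000 * 100  # mean Earth radius in cm

def offset_to_geo(cam_lat, cam_lon, east_cm, north_cm):
    """Convert a small east/north offset in cm from a camera with known
    geolocation into lat/lon (flat-earth approximation, fine at camera scale)."""
    dlat = math.degrees(north_cm / EARTH_R_CM)
    dlon = math.degrees(east_cm / (EARTH_R_CM * math.cos(math.radians(cam_lat))))
    return cam_lat + dlat, cam_lon + dlon

def pan_tilt_to_target(ptz_lat, ptz_lon, ptz_height_cm, tgt_lat, tgt_lon):
    """Pan (degrees from north, clockwise) and tilt (degrees below horizontal)
    needed to point a PTZ camera at a ground-level target."""
    north_cm = math.radians(tgt_lat - ptz_lat) * EARTH_R_CM
    east_cm = math.radians(tgt_lon - ptz_lon) * EARTH_R_CM * math.cos(math.radians(ptz_lat))
    pan = math.degrees(math.atan2(east_cm, north_cm)) % 360
    ground_dist = math.hypot(east_cm, north_cm)
    tilt = math.degrees(math.atan2(ptz_height_cm, ground_dist))
    return pan, tilt

# Example: detection 3.6 m north and 1.6 m east of a fixed camera at (45.8150, 15.9820)
lat, lon = offset_to_geo(45.8150, 15.9820, 160, 360)
# A nearby PTZ camera mounted 4 m high points itself at that geolocation
print(pan_tilt_to_target(45.81490, 15.98190, 400, lat, lon))
```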
I invented this system simply because it was not possible to use the chessboard method for camera calibration in our application that tracks stage performers with moving heads. This system has enabled us to use GoPro Max Lens Mode for stage tracking and to achieve tracking accuracy of more than 99%. There are a couple of videos on our page https://visionspot.eu/ that show how the same camera model works for stage tracking.
We can take any type of camera, transform it to a pinhole camera model, and use it for tracking, distance measuring, or whatever…

Best regards,
Hrvoje