Query on Multi-Camera Calibration for MV3DT

Hardware Platform: GPU (NVIDIA RTX A6000 48GB)

DeepStream Version: 9.0

JetPack Version: N/A (not Jetson)

TensorRT Version: N/A

NVIDIA GPU Driver Version: 580.126.09

Issue Type: Questions

How to reproduce the issue:

I am using AutoMagicCalib 2.0.0 (nvcr.io/nvidia/auto-magic-calib:2.0.0) to calibrate two store surveillance cameras. My store layout is a long narrow space (A->B->C), with Camera 0 mounted at position A and Camera 1 at position B, both pointing toward C. The cameras have limited spatial overlap (only the B->C zone).

Calibration consistently fails with the following error in multi_view_calib_*.log:

INFO - tracklet_matches: [((0, 55), (1, 162)), ((0, 10), (1, 63))]
ERROR - Number of matching tracklet candidates is less than 3

Only 2 tracklet pairs are matched across cameras. The RANSAC step (min_samples: 3) requires at least 3 matches — a geometric minimum for R,t estimation — so calibration cannot proceed even after I lowered min_matched_tracklets to 2 in mv_amc_config.yaml.
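
To check quickly how many cross-camera pairs a run produced, the `tracklet_matches` line can be pulled out of the log. The following is a minimal sketch, assuming the log line has exactly the format shown above and that each inner tuple is (camera index, tracklet ID). That interpretation is inferred from the excerpt, not taken from the AutoMagicCalib documentation. The helper name `parse_tracklet_matches` is hypothetical.

```python
import ast
import re

# Matches the "tracklet_matches: [...]" line format seen in the log excerpt.
MATCH_LINE = re.compile(r"tracklet_matches:\s*(\[.*\])\s*$")

def parse_tracklet_matches(log_text):
    """Return the last tracklet_matches list found in the log text, or []."""
    matches = []
    for line in log_text.splitlines():
        found = MATCH_LINE.search(line)
        if found:
            # The payload is a Python-style literal of pairs of 2-tuples.
            matches = list(ast.literal_eval(found.group(1)))
    return matches

sample_log = """\
INFO - tracklet_matches: [((0, 55), (1, 162)), ((0, 10), (1, 63))]
ERROR - Number of matching tracklet candidates is less than 3
"""

pairs = parse_tracklet_matches(sample_log)
print(len(pairs), "matched pairs")
# Assumption: each tuple is (camera index, tracklet ID).
for (cam_a, track_a), (cam_b, track_b) in pairs:
    print(f"camera {cam_a} track {track_a} <-> camera {cam_b} track {track_b}")
```

Running this over several calibration attempts shows whether a change to the videos or to the configuration moves the pair count toward the required 3.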

I also enabled enable_manual_adjustment: true and provided 4+ manual alignment point sets in Step 4, but calibration still fails because the base tracklet matching is insufficient.

Single-view outputs are produced (per-camera camInfo_hyper_00.yaml and camInfo_hyper_00_opencv.yaml), but multi-view relative pose is not computed.

Background context:

My ultimate goal is to adapt the official deepstream-tracker-3d-multi-view 4-camera sample to my own 2-camera setup. I am trying to produce the equivalent input files for a 2-camera configuration. Looking at the sample dataset structure, MV3DT requires:

  • camInfo/ — per-camera YAML files containing projectionMatrix_3x4_w2p

  • transforms.yml — coordinate transformation definitions

  • videos/ — input video files

I chose AutoMagicCalib as the calibration tool, but I am not sure if it is the right tool for this purpose, or if there is a better recommended workflow.

My questions:

  1. Is there a recommended workflow for cameras with very limited overlap (both cameras facing the same direction in a narrow space), where automatic tracklet matching cannot find ≥3 candidates?

  2. Can the manually provided alignment points (Step 4) substitute for tracklet matching to bootstrap the R,t estimation, even when tracklet matching fails entirely?

  3. My downstream goal is MV3DT. The single-view camInfo_hyper_00.yaml outputs contain per-camera intrinsics and extrinsics. Are these outputs compatible with MV3DT’s required projectionMatrix_3x4_w2p format? If so, what additional steps are needed to generate the multi-view relative pose?

  4. How is transforms.yml generated for MV3DT? Is it an output of AutoMagicCalib, or must it be written manually from the calibration results?

  5. Is AutoMagicCalib the intended/recommended tool for generating MV3DT-compatible camInfo files, or is there another calibration approach (e.g., OpenCV checkerboard, manual matrix construction) that is more reliable for this use case?

MV3DT needs camera overlap to associate targets across cameras. If there is very limited camera overlap, can you use SV3DT? You can use the OpenCV method to get the projectionMatrix_3x4, as described in the Gst-nvtracker section of the DeepStream documentation.
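
For reference, a 3x4 world-to-pixel matrix in the generic pinhole model is P = K [R | t], where K holds the intrinsics and R, t the extrinsics. With OpenCV, R and t would typically come from `cv2.solvePnP` followed by `cv2.Rodrigues`. The sketch below uses plain Python so it runs without dependencies, and all numeric values are hypothetical. The exact coordinate convention DeepStream expects for projectionMatrix_3x4 and projectionMatrix_3x4_w2p is not confirmed here and should be checked against the Gst-nvtracker documentation.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def projection_matrix(K, R, t):
    """Compose the 3x4 pinhole matrix P = K [R | t] (world -> pixel)."""
    Rt = [R[i] + [t[i]] for i in range(3)]  # 3x4 extrinsic block
    return matmul(K, Rt)

def project(P, point_xyz):
    """Project a 3D world point to pixel coordinates with a 3x4 matrix."""
    X = [[point_xyz[0]], [point_xyz[1]], [point_xyz[2]], [1.0]]
    u, v, w = (row[0] for row in matmul(P, X))
    return u / w, v / w  # divide out the homogeneous scale

# Hypothetical values for illustration only (not from a real calibration).
K = [[1000.0, 0.0, 960.0],
     [0.0, 1000.0, 540.0],
     [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 5.0]

P = projection_matrix(K, R, t)
print(project(P, (0.0, 0.0, 0.0)))
print(project(P, (1.0, 0.5, 0.0)))
```

A useful sanity check is to project a few surveyed floor points with the resulting matrix and compare them against where those points actually appear in the image.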

Hi kesong,

Thank you for the suggestion. I have since repositioned the cameras — imagine a long corridor with points A, B, and C from one end to the other. Camera 0 is now mounted at position A and Camera 1 at position C, both pointing toward B (the center). This gives the two cameras a much larger shared field of view covering the B zone.

However, the calibration still fails with the same error:

ERROR - Number of matching tracklet candidates is less than 3

So the tracklet matching issue persists even with this new layout.

Regarding SV3DT: my ultimate goal is cross-camera person re-identification (Re-ID) — I want to assign a consistent identity to the same person as they move from the view of Camera 0 into the view of Camera 1. Given this requirement, I would like to ask:

  1. Is SV3DT capable of supporting cross-camera Re-ID, or is it limited to single-camera 3D tracking only? My understanding is that SV3DT operates per-camera independently, which would not support Re-ID across camera views.

  2. If MV3DT is required for cross-camera Re-ID, is there an alternative calibration workflow I can use to produce the required camInfo files and transforms.yml when AutoMagicCalib consistently fails at the tracklet matching step?

  3. Could you clarify whether the corridor layout I described (both cameras covering the same central zone B) should in theory provide sufficient overlap for MV3DT calibration? If so, are there any configuration parameters in mv_amc_config.yaml I should adjust to improve tracklet matching success?

Please find the attached log file for your reference. I hope it helps with the diagnosis.

Thank you very much for your continued support.

calibration.log (156.1 KB)

You need to ensure that at least 3 people appear in both cameras at the same time, so that AMC can calibrate based on the locations of the people.

Thanks for your reply! I’ve already verified my video has 4 people walking around continuously in both camera views for 5 minutes, so the condition should be met. But I’m still getting the same error. Any thoughts on what else I should check?
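
One way to double-check the co-visibility condition is to count the frames in which both cameras have at least 3 active tracklets. The following is a minimal sketch, assuming per-camera tracklets are available as (track ID, first frame, last frame) tuples and that both videos share a synchronized frame timeline. The data layout and the function name `frames_with_enough_people` are hypothetical, not an AutoMagicCalib output format. It only checks a necessary condition: it does not verify that the same individuals are seen by both cameras or that they stand in the shared zone.

```python
def frames_with_enough_people(tracks_cam0, tracks_cam1, min_people=3):
    """Count frames in which both cameras see at least min_people tracks.

    Each input is a list of (track_id, first_frame, last_frame) tuples with
    inclusive frame ranges on a shared, synchronized frame timeline.
    """
    def active_counts(tracks):
        counts = {}
        for _track_id, first, last in tracks:
            for frame in range(first, last + 1):
                counts[frame] = counts.get(frame, 0) + 1
        return counts

    c0 = active_counts(tracks_cam0)
    c1 = active_counts(tracks_cam1)
    return sum(1 for f in c0
               if c0[f] >= min_people and c1.get(f, 0) >= min_people)

# Hypothetical tracklets for illustration only.
cam0 = [(55, 0, 100), (10, 20, 120), (7, 50, 90)]
cam1 = [(162, 10, 110), (63, 30, 130), (4, 60, 80)]
print(frames_with_enough_people(cam0, cam1))
```

If the count is high but matching still fails, the two videos may not be time-aligned, or the per-camera tracklets may be too fragmented to match. Both are guesses to check against the attached log, not confirmed causes.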