TAO Toolkit Sparse4D Transfer Learning Issue Report

High-confidence false positives from unmatched queries

Evidence video: video.mp4

1) Executive Summary

During transfer learning with nvcr.io/nvidia/tao/tao-toolkit:6.25.11-pyt, Sparse4D outputs many high-confidence detections that are not assigned stable tracking IDs and are not removed by TopK + score-threshold filtering.

Code inspection indicates a likely training bug: queries not matched to GT are assigned a background label (num_cls) but are excluded from classification loss in TAO’s custom focal-loss implementation.

As a result, a large portion of unmatched queries receive no negative classification supervision, which can lead to score inflation and many false positives at inference.


2) Scope and Source Artifacts

  • Target container image: nvcr.io/nvidia/tao/tao-toolkit:6.25.11-pyt

  • TAO Sparse4D source extracted from container path:
    /usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/sparse4d/...

  • Upstream comparison source: Horizon Robotics Sparse4D repository (HorizonRobotics/Sparse4D), including the upstream sparse4d_head.py

  • Reference DETR implementation: facebookresearch/detr/models/detr.py

  • Reference mmdetection focal-loss behavior used by upstream Sparse4D: mmdet focal_loss.py (v2.28.2)


3) TAO Code Evidence (Direct)

3.1 Unmatched queries are explicitly labeled as background (num_cls)

File: nvidia_tao_pytorch/cv/sparse4d/model/detection3d/target.py

TAO initializes classification targets for all predictions as num_cls, then overwrites only matched predictions with GT class labels:


output_cls_target = (
    cls_target[0].new_ones([bs, num_pred], dtype=torch.long) * num_cls
)
...
output_cls_target[i, pred_idx] = cls_target[i][target_idx]

This means all unmatched queries remain with label num_cls (background/no-object semantics).
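The encoding above can be reproduced in a minimal sketch (the shapes and matcher indices below are hypothetical, chosen only for illustration):

```python
import torch

bs, num_pred, num_cls = 1, 5, 3  # hypothetical sizes for illustration

# Every prediction starts as background (label == num_cls) ...
output_cls_target = torch.ones([bs, num_pred], dtype=torch.long) * num_cls

# ... then only matched predictions are overwritten with GT labels.
# Pretend the matcher assigned predictions 0 and 3 to GT classes 2 and 1.
pred_idx = torch.tensor([0, 3])
gt_labels = torch.tensor([2, 1])
output_cls_target[0, pred_idx] = gt_labels

print(output_cls_target)  # tensor([[2, 3, 3, 1, 3]]) -- unmatched stay at num_cls
```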

3.2 Regression loss is computed only for matched queries

File: nvidia_tao_pytorch/cv/sparse4d/model/criterion.py

Regression is masked by non-zero target boxes:


mask = torch.logical_not(torch.all(reg_target == 0, dim=-1))
...
reg_target = reg_target.flatten(end_dim=1)[mask]
reg = reg.flatten(end_dim=1)[mask]

So unmatched queries do not contribute regression loss (this is expected for DETR-like training).
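The masking can be illustrated with a toy tensor (values are hypothetical):

```python
import torch

# Hypothetical flattened regression targets: unmatched queries carry
# all-zero boxes, matched queries carry non-zero GT boxes.
reg_target = torch.tensor([[0., 0., 0.],
                           [1., 2., 3.],   # matched query
                           [0., 0., 0.]])

mask = torch.logical_not(torch.all(reg_target == 0, dim=-1))
print(mask)  # only the matched (non-zero) query contributes regression loss
```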

3.3 TAO custom focal loss excludes background-labeled queries from classification loss

File: nvidia_tao_pytorch/cv/sparse4d/model/criterion.py (custom FocalLoss)

TAO filters valid samples with:


num_classes = pred.size(1)
valid_mask = (target >= 0) & (target < num_classes)
...
pred = pred[valid_mask]
valid_target = target[valid_mask]

Because unmatched queries are encoded as target == num_classes, they fail target < num_classes and are dropped from classification loss entirely.
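A toy example (hypothetical labels) makes the dropped-sample effect concrete:

```python
import torch

num_classes = 3
# Hypothetical flattened targets: two matched queries (labels 0 and 2),
# three unmatched queries encoded as background (label == num_classes == 3).
target = torch.tensor([0, 3, 3, 2, 3])

valid_mask = (target >= 0) & (target < num_classes)
print(valid_mask)        # background entries fail `target < num_classes`
print(valid_mask.sum())  # only 2 of 5 queries contribute classification loss
```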

3.4 Inference keeps top scores directly from sigmoid logits

File: nvidia_tao_pytorch/cv/sparse4d/model/detection3d/decoder.py

Inference applies sigmoid + TopK + threshold:


cls_scores = cls_scores[output_idx].sigmoid()
cls_scores, indices = cls_scores.flatten(start_dim=1).topk(
    self.num_output, dim=1, sorted=self.sort_results
)
...
mask = cls_scores >= self.score_threshold

If unmatched queries are not negatively supervised during training, they can keep inflated scores and survive this filter.
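A small sketch (hypothetical logits and decoder settings) shows how an unsupervised query can pass this filter:

```python
import torch

num_output, score_threshold = 3, 0.3  # hypothetical decoder settings

# Hypothetical logits for 5 queries x 2 classes; query 3 was never
# negatively supervised, so its logits stayed inflated.
cls_logits = torch.tensor([[-4.0, -3.0],
                           [ 2.0, -5.0],   # genuine detection
                           [-6.0, -6.0],
                           [ 1.5,  1.0],   # unmatched, inflated query
                           [-5.0, -4.0]])

cls_scores = cls_logits.sigmoid().flatten()
top_scores, indices = cls_scores.topk(num_output)
kept = top_scores >= score_threshold
print(top_scores[kept])  # the inflated query survives TopK + threshold
```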


4) Upstream Sparse4D / DETR Comparison

4.1 Upstream Sparse4D uses same target encoding (num_cls for unmatched) but different loss backend

In upstream Sparse4D, unmatched target encoding is also num_cls (target.py), and loss_cls is built via mmdetection:

  • The upstream head constructs the classification loss via self.loss_cls = build(loss_cls, LOSSES).

4.2 mmdetection focal loss maps num_classes label to all-zero one-hot (negative supervision retained)

Reference mmdetection code:


target = F.one_hot(target, num_classes=num_classes + 1)
target = target[:, :num_classes]

For target == num_classes, one-hot becomes all zeros over foreground classes, which still contributes BCE/focal negative loss on all class logits.
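The mapping can be checked directly (labels below are hypothetical):

```python
import torch
import torch.nn.functional as F

num_classes = 3
target = torch.tensor([1, 3, 3])  # one foreground label, two background (== num_classes)

one_hot = F.one_hot(target, num_classes=num_classes + 1)[:, :num_classes]
print(one_hot)
# Background rows become all zeros over the foreground classes, so BCE/focal
# loss still pushes every logit of an unmatched query toward zero.
```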

4.3 DETR also supervises unmatched queries as no-object

In DETR:


target_classes = torch.full(src_logits.shape[:2], self.num_classes, ...)
target_classes[idx] = target_classes_o
loss_ce = F.cross_entropy(..., target_classes, self.empty_weight)

Unmatched queries are explicitly supervised as no-object class, preventing uncontrolled high confidence.
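A runnable sketch of this scheme (shapes and match indices are hypothetical; the 0.1 no-object weight is illustrative, mirroring DETR's down-weighted but never dropped no-object class):

```python
import torch
import torch.nn.functional as F

num_classes = 3                                   # class index 3 == "no-object"
src_logits = torch.randn(1, 4, num_classes + 1)   # (batch, queries, classes + 1)

# All queries default to the no-object class; matched ones get their GT label.
target_classes = torch.full(src_logits.shape[:2], num_classes, dtype=torch.long)
target_classes[0, 1] = 2  # pretend query 1 matched a GT box of class 2

empty_weight = torch.ones(num_classes + 1)
empty_weight[-1] = 0.1    # down-weight, but never drop, the no-object class
loss_ce = F.cross_entropy(src_logits.transpose(1, 2), target_classes,
                          weight=empty_weight)
print(loss_ce)  # unmatched queries still receive (down-weighted) supervision
```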


5) Root Cause Hypothesis

The TAO Sparse4D implementation in 6.25.11-pyt appears to combine:

  1. Background target encoding: unmatched queries labeled as num_cls, and

  2. Custom focal-loss filtering: keeps only target < num_classes.

This combination removes classification loss for unmatched queries, unlike upstream Sparse4D + mmdetection behavior and DETR-style no-object supervision.


6) Practical Impact

  • High confidence is not sufficiently penalized on unmatched queries.

  • TopK + threshold post-filtering cannot suppress enough false detections.

  • Tracking receives many spurious detections, causing unstable IDs and large false-positive volumes.


7) Reproducibility Note (Source Extraction)

Sparse4D TAO source was extracted from the container for inspection using standard Docker copy flow (docker create + docker cp) from:

/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/sparse4d/


8) Key File Locations to Inspect in TAO Image

  • nvidia_tao_pytorch/cv/sparse4d/model/detection3d/target.py

  • nvidia_tao_pytorch/cv/sparse4d/model/criterion.py

  • nvidia_tao_pytorch/cv/sparse4d/model/detection3d/decoder.py

  • nvidia_tao_pytorch/cv/sparse4d/model/sparse4d_pl_model.py


Because we cannot completely rule out that this issue is caused by configuration settings, we are attaching the experiment.yaml used for the transfer learning shown in the video at the top of this report. Please review its contents.
experiment_yaml.txt (8.5 KB)

Thanks for the detailed report. From the spec you shared, there is no obvious configuration mistake that would by itself explain the reported high-confidence false positives. Have you ever run the official notebook with the dataset mentioned in it? The official Sparse4D notebook/baseline workflow is useful as an A/B reference: if the same symptom also appears there, it would strengthen the case that this is an implementation-level issue rather than something specific to the custom transfer-learning setup; if the notebook baseline does not reproduce it, that would help narrow down the triggering conditions.

Additionally, based on your findings, please try the change below to nvidia_tao_pytorch/cv/sparse4d/model/criterion.py.

-        valid_mask = (target >= 0) & (target < num_classes)  # ignore negative/out-of-range
+        target = target.long()
+        valid_mask = target >= 0  # keep foreground and background; ignore only negative labels
         if not valid_mask.any():
-            # No valid samples, return zero loss
-            return pred.new_tensor(0.0)
+            return pred.sum() * 0

         pred = pred[valid_mask]
         valid_target = target[valid_mask]
         if weight is not None:
             weight = weight[valid_mask]

-        # Now safe to do one_hot
-        one_hot_target = F.one_hot(valid_target, num_classes=num_classes).float()
+        one_hot_target = F.one_hot(
+            valid_target.clamp(max=num_classes),
+            num_classes=num_classes + 1
+        )[:, :num_classes].to(dtype=pred.dtype)
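As a standalone sanity check (hypothetical tensors, not the actual criterion code), the patched mask supervises all non-negative labels instead of only the matched ones:

```python
import torch
import torch.nn.functional as F

num_classes = 3
pred = torch.randn(5, num_classes)           # hypothetical flattened logits
target = torch.tensor([0, 3, 3, 2, 3])       # 3 == background label (num_cls)

old_mask = (target >= 0) & (target < num_classes)  # drops background queries
new_mask = target >= 0                             # keeps background queries

# Patched one-hot: background maps to an all-zero row, as in mmdetection.
one_hot = F.one_hot(target.clamp(max=num_classes),
                    num_classes=num_classes + 1)[:, :num_classes].to(pred.dtype)

print(old_mask.sum().item(), new_mask.sum().item())  # 2 vs 5 supervised queries
```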


Thank you for your response.
I have previously performed transfer learning using the official notebook and the official dataset, with the default training settings.

I will conduct the A/B test you suggested and compare the results.

Thank you.