Training data preparation for 3D object detection networks

I’ve been trying out two different networks in the TAO Toolkit: PointPillars and CenterPose.
PointPillars takes point cloud data and KITTI-formatted annotations as inputs, while CenterPose takes a 2D image plus a .json file containing the necessary training information; the intrinsic matrix of the camera is also required.

I’m thinking about the possibility of training both networks for 3D virtual fences, in which people or certain other objects such as cars need to be annotated.

  1. I’ve downloaded some open datasets for 3D object detection besides the KITTI dataset. If I want to add them to the training set, conversion is inevitable. What needs to be taken care of when doing this?

  2. It seems that the only available dataset is Objectron, which contains only 8 classes. Is there an annotation tool for creating my own dataset for training CenterPose?


You can take a look at Data Annotation Format - NVIDIA Docs.

This is similar to the question in Video 3D Bounding Box Annotation tool for Objectron · Issue #61 · google-research-datasets/Objectron · GitHub. You can ask there again, and also search the web. Maybe you can take a look at GitHub - walzimmer/3d-bat: 3D Bounding Box Annotation Tool (3D-BAT) Point cloud and Image Labeling.

Thanks as always.

As for the first question, I do know how the training data needs to be laid out when training CenterPose. It’s just that I have no idea how to produce the .json file containing all the required information about the corresponding object.
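In case it helps others, here is a minimal sketch of what such a per-image .json might look like. The field names below (camera_data, objects, projected_cuboid, etc.) are assumptions modeled on the Objectron/DOPE style of annotation, not the authoritative TAO CenterPose schema, so please verify them against the TAO docs:

```python
import json

# Hypothetical per-image annotation for CenterPose training.
# All field names are assumptions in the Objectron/DOPE style;
# check the TAO CenterPose documentation for the real schema.
annotation = {
    "camera_data": {
        # intrinsics of the capturing camera
        "intrinsics": {"fx": 600.0, "fy": 600.0, "cx": 320.0, "cy": 240.0},
        "width": 640,
        "height": 480,
    },
    "objects": [
        {
            "class": "chair",
            # 3D pose in the camera frame
            "location": [0.1, -0.2, 1.5],
            "quaternion_xyzw": [0.0, 0.0, 0.0, 1.0],
            # object dimensions in meters
            "scale": [0.5, 0.9, 0.5],
            # 2D projections of the 8 cuboid corners plus the center
            "projected_cuboid": [[310, 200], [330, 200], [330, 260],
                                 [310, 260], [305, 195], [335, 195],
                                 [335, 265], [305, 265], [320, 230]],
        }
    ],
}

with open("000001.json", "w") as f:
    json.dump(annotation, f, indent=2)
```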

On the other hand, I’ve also been working with other available point cloud data and have converted some of it into .bin files, but I guess I still need to work on producing the corresponding .txt files containing the KITTI-formatted info.
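For reference, here is a minimal sketch of such a conversion, assuming the source clouds are .pcd files readable by open3d. Since many sources carry no intensity channel, this zero-fills it:

```python
import numpy as np
import open3d as o3d

def pcd_to_kitti_bin(pcd_path: str, bin_path: str) -> None:
    """Convert a .pcd point cloud to a KITTI-style .bin file.

    KITTI .bin files store an Nx4 float32 array: x, y, z, intensity.
    If the source cloud has no intensity channel, we zero-fill it
    (an assumption -- substitute real intensities if you have them).
    """
    pcd = o3d.io.read_point_cloud(pcd_path)
    xyz = np.asarray(pcd.points, dtype=np.float32)         # (N, 3)
    intensity = np.zeros((xyz.shape[0], 1), dtype=np.float32)
    np.hstack([xyz, intensity]).tofile(bin_path)

pcd_to_kitti_bin("scan.pcd", "000000.bin")
```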

For instance:
Mask 0 0 0.0 156 279 451 590 0 0 0 0 0 0 0

The line above is an example annotation of a mask containing ONLY 2D bbox info; all the values other than the class name and the 2D bbox are set to 0.
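For context, a standard KITTI label line has 15 whitespace-separated fields: the class name, truncation, occlusion, alpha, the 2D bbox (left, top, right, bottom in pixels), the 3D dimensions (height, width, length in meters), the 3D location (x, y, z in camera coordinates), and rotation_y. A small helper that names them:

```python
# Standard KITTI label fields, in order (15 values per line):
KITTI_FIELDS = [
    "type",                      # object class, e.g. 'Car', 'Pedestrian'
    "truncated", "occluded", "alpha",
    "bbox_left", "bbox_top", "bbox_right", "bbox_bottom",  # 2D box (px)
    "dim_height", "dim_width", "dim_length",               # 3D size (m)
    "loc_x", "loc_y", "loc_z",   # 3D center in camera coords (m)
    "rotation_y",                # yaw around the camera Y axis (rad)
]

def parse_kitti_label(line: str) -> dict:
    """Parse one KITTI label line into a field -> value dict."""
    parts = line.split()
    return {k: (parts[i] if i == 0 else float(parts[i]))
            for i, k in enumerate(KITTI_FIELDS)}

print(parse_kitti_label("Mask 0 0 0.0 156 279 451 590 0 0 0 0 0 0 0"))
```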

I’d like to know a few things regarding training PointPillarNets:

  1. Is the 2D bbox info, i.e., the 4 non-zero values shown above, unnecessary?
  2. I know that the last 7 zeroes indicate the 3D object info, but are they all necessary when training PointPillarNets, or do I just need some of them?

I’ll look into 3D-BAT. Thanks.

In TAO CenterPose, tao_tutorials/notebooks/tao_launcher_starter_kit/centerpose/centerpose.ipynb at main · NVIDIA/tao_tutorials · GitHub, it uses the Objectron dataset.
For annotation, maybe you can refer to Annotation tool and Synthetic dataset Generation · Issue #6 · google-research-datasets/Objectron · GitHub.

They are needed. You can refer to PointPillars - NVIDIA Docs.

It seems that the Objectron annotation tool isn’t open-sourced, so currently we can’t really train CenterPose on a custom dataset as we have no way to produce the annotations.

I later managed to convert another point cloud dataset into KITTI format, and it looks like the following:

obj_type 0 0 0 0 0 0 0 h w l x y z yaw

I managed to draw the bboxes on the corresponding point cloud data using open3d, and the bboxes are correct.
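For anyone wanting to sanity-check labels the same way, here is a minimal sketch using open3d’s OrientedBoundingBox. It assumes the box parameters (h, w, l, x, y, z, yaw) are already expressed in the same frame as the points; raw KITTI labels are in the camera frame and need the calibration transform first:

```python
import numpy as np
import open3d as o3d

def draw_boxes_on_cloud(bin_path: str, labels: list) -> None:
    """Visualize 3D boxes (h, w, l, x, y, z, yaw) on a KITTI-style .bin.

    Assumes the box parameters are already in the same (LiDAR) frame
    as the points. Raw KITTI labels are in the camera frame and must
    be transformed with the calibration matrices first.
    """
    pts = np.fromfile(bin_path, dtype=np.float32).reshape(-1, 4)[:, :3]
    cloud = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(pts))

    geoms = [cloud]
    for h, w, l, x, y, z, yaw in labels:
        R = o3d.geometry.get_rotation_matrix_from_axis_angle(
            np.array([0.0, 0.0, yaw]))           # yaw about the up axis
        box = o3d.geometry.OrientedBoundingBox(
            center=np.array([x, y, z]),
            R=R,
            extent=np.array([l, w, h]))          # local x/y/z extents
        box.color = (1.0, 0.0, 0.0)
        geoms.append(box)
    o3d.visualization.draw_geometries(geoms)

draw_boxes_on_cloud("000000.bin", [(1.5, 1.6, 3.9, 5.0, 2.0, -0.8, 0.3)])
```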

I downloaded some open synthetic data, PreSIL, which contains both point clouds and annotations. However, the intensity value is 0 for ALL points according to their paper, and I verified this using open3d. I wonder if that makes a difference, since adding the PreSIL data to the existing KITTI dataset in the training set didn’t improve the results as expected…
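For reference, the all-zero intensity channel can be verified in a couple of lines of numpy (the file name here is a placeholder):

```python
import numpy as np

# Check whether the intensity channel (4th column) of a KITTI-style
# .bin point cloud is all zeros, as reported for PreSIL.
pts = np.fromfile("presil_000000.bin", dtype=np.float32).reshape(-1, 4)
print("intensity min/max:", pts[:, 3].min(), pts[:, 3].max())
print("all zero:", not np.any(pts[:, 3]))
```

If the network consumes intensity as an input feature, mixing all-zero intensities with real KITTI intensities could plausibly introduce a domain gap between the two parts of the training set.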

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one.
Thanks

You may also take a look at the nuScenes dataset:
nuScenes Dataset | Papers With Code,
nuscenes_tutorial,
https://www.nuscenes.org/nuscenes#data-annotation.
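As a sketch of reading nuScenes annotations programmatically, assuming the nuscenes-devkit package is installed and the v1.0-mini split has been downloaded to a local folder:

```python
from nuscenes.nuscenes import NuScenes

# Load the mini split (assumes nuscenes-devkit is installed and the
# v1.0-mini data lives under ./data/nuscenes).
nusc = NuScenes(version='v1.0-mini', dataroot='./data/nuscenes')

# Walk the annotations of the first sample: each one carries a 3D box
# (center, size, rotation quaternion) plus a category name.
sample = nusc.sample[0]
for token in sample['anns']:
    ann = nusc.get('sample_annotation', token)
    print(ann['category_name'], ann['translation'],
          ann['size'], ann['rotation'])
```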

Additionally, you can use Omniverse Isaac Sim to create a synthetic dataset to train the model.

Please see the following documentation: 10.10. Object Detection Synthetic Data Generation — Omniverse IsaacSim latest documentation

The user might need to collect their USD assets for their custom dataset and create the synthetic dataset in Omniverse. The tool will output 3D labels, which can be used to train the CenterPose model.
