I’ve been trying out two different networks in the TAO Toolkit: PointPillars and CenterPose.
PointPillars takes point cloud data and KITTI-formatted annotations as inputs, while CenterPose takes a 2D image and a .json file containing the information necessary for training; the camera's intrinsic matrix is also required.
I’m thinking about the possibility of training both networks for the purpose of 3D virtual fences, in which people or certain other objects such as cars need to be annotated.
Currently I’ve downloaded some open datasets for 3D object detection aside from the KITTI dataset; if I want to add them to the training set, conversion is inevitable. What are the things that need to be taken care of when doing this?
It seems that currently there’s only the Objectron dataset, which contains only 8 classes. Is there any annotation tool for creating my own dataset to be used for training CenterPose?
As for the first question, I do know how the training data needs to be placed when training CenterPose. It’s just that I have no idea how to produce the .json file containing all the required information for the corresponding object.
On the other hand, I’ve also been working with other available point cloud data and have converted some of it into .bin files, but I guess I still need to work on producing the corresponding .txt files containing KITTI-formatted info.
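For example, a label line of that kind might look like this (class name and bbox values are just illustrative, not from my actual labels):

Pedestrian 0.00 0 0.00 614.24 181.78 727.31 284.77 0.00 0.00 0.00 0.00 0.00 0.00 0.00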
The line above is an example of an annotation containing ONLY 2D bbox info, while the rest of the values aside from the class name are set to 0.0.
I’d like to know a few things regarding training PointPillarNets:
Is the 2D bbox info, i.e., the 4 non-zero values shown above, unnecessary?
I know that the last 7 zeroes indicate 3D object info, but are they all necessary when training PointPillarNets, or do I just need some of them?
It seems that the Objectron annotation tool isn’t open-sourced, so currently we can’t really train CenterPose on a custom dataset as we have no way to produce one.
I later managed to convert another point cloud dataset into KITTI format, and the labels look like the following:
obj_type 0 0 0 0 0 0 0 h w l x y z yaw
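Roughly, each such label line can be produced like this (a minimal sketch of my own; the function name and formatting are not from any TAO tool):

def to_kitti_line(obj_type, h, w, l, x, y, z, yaw):
    # The first 7 numeric KITTI fields (truncation, occlusion, alpha, 2D bbox)
    # are left as 0 since only the 3D box info is available.
    zeros = " ".join(["0"] * 7)
    return f"{obj_type} {zeros} {h:.2f} {w:.2f} {l:.2f} {x:.2f} {y:.2f} {z:.2f} {yaw:.2f}"

# e.g. to_kitti_line("Car", 1.6, 1.7, 4.1, 5.0, 0.0, 1.0, 0.3)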
I managed to draw bboxes on the corresponding pointcloud data using open3d and the bboxes are correct.
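For reference, the check with open3d is roughly like the sketch below (file name and box values are illustrative; it assumes the box pose is expressed in the same frame as the point cloud with yaw about the vertical z axis, which differs from KITTI’s camera-frame convention):

import numpy as np
import open3d as o3d

# Load a KITTI-style .bin point cloud: rows of (x, y, z, intensity) as float32
points = np.fromfile("000000.bin", dtype=np.float32).reshape(-1, 4)

pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(points[:, :3])

# Illustrative box parameters (h, w, l, x, y, z, yaw) in the point-cloud frame
h, w, l = 1.6, 1.7, 4.1
x, y, z = 5.0, 0.0, 1.0
yaw = 0.3

# Rotation about the z axis by yaw, then build the oriented box
R = o3d.geometry.get_rotation_matrix_from_axis_angle(np.array([0.0, 0.0, yaw]))
box = o3d.geometry.OrientedBoundingBox(center=np.array([x, y, z]), R=R, extent=np.array([l, w, h]))
box.color = (1.0, 0.0, 0.0)

o3d.visualization.draw_geometries([pcd, box])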
I downloaded some open synthetic data, PreSIL, which contains both point clouds and annotations, but the intensity values for the point clouds are ALL 0 according to their paper, and I also verified this using open3d. I wonder if that makes a difference, as adding the PreSIL data to the existing KITTI training set didn’t improve the results as I expected…
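For what it’s worth, the all-zero intensity can also be confirmed directly from the .bin file (file name is illustrative):

import numpy as np

# KITTI-style .bin: rows of (x, y, z, intensity) as float32
pts = np.fromfile("000000.bin", dtype=np.float32).reshape(-1, 4)
print("intensity min/max:", pts[:, 3].min(), pts[:, 3].max())
print("all zero:", bool(np.all(pts[:, 3] == 0)))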
There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.
Thanks
The user might need to collect USD assets for their custom dataset and create the synthetic dataset in Omniverse. The tool will output the 3D labels, which can be used to train the CenterPose model.