I have generated synthetic data of a tabletop scene using the DOPE writer, which captures the RGB image, depth image, semantic segmentation, and poses. Each pose consists of a location and an orientation.
What depth scale should I use for the synthetically generated data so that reconstructing the point cloud through the Open3D APIs gives an accurate representation? How can I determine this depth scale?
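For reference, this is roughly how I build the point cloud. The file names, intrinsics, and the depth_scale of 1000 below are placeholders (1000 is just Open3D's default for millimetre depth), and the depth_scale is exactly the value I am unsure about:

```python
import open3d as o3d

# Load the RGB and depth images written by the DOPE writer
# (file names are placeholders for my actual output files)
color = o3d.io.read_image("rgb_0001.png")
depth = o3d.io.read_image("depth_0001.png")

# depth_scale converts the stored depth values to metres; this is the
# value I need to determine for the synthetic data
rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
    color, depth, depth_scale=1000.0, depth_trunc=3.0,
    convert_rgb_to_intensity=False)

# Intrinsics are placeholders; I substitute the camera parameters
# reported by the writer
intrinsic = o3d.camera.PinholeCameraIntrinsic(
    width=1280, height=720, fx=600.0, fy=600.0, cx=640.0, cy=360.0)

pcd = o3d.geometry.PointCloud.create_from_rgbd_image(rgbd, intrinsic)
o3d.visualization.draw_geometries([pcd])
```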
I also observed that the location values are all 100 times the expected values. For instance, in the tabletop scene the camera is placed at a height of 1.2 m and the table is 1 m x 1 m, yet a location written out during synthetic data generation is [65488, 17.9669075, -116.375427]. The values appear to be 100 times the actual ones. If so, why is that?
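If the factor of 100 comes from the stage being authored in centimetres, I would expect that to show up in the stage metadata. This is a small check I plan to run in the Isaac Sim script editor (a sketch assuming the standard USD and omni.usd APIs):

```python
from pxr import UsdGeom
import omni.usd

# Read the stage's unit scale; metersPerUnit == 0.01 would mean the stage
# is authored in centimetres, which could explain location values that are
# 100x larger than metres
stage = omni.usd.get_context().get_stage()
meters_per_unit = UsdGeom.GetStageMetersPerUnit(stage)
print("metersPerUnit:", meters_per_unit)

# Converting one of the written locations back under that assumption
location = [65488, 17.9669075, -116.375427]
print([v * meters_per_unit for v in location])
```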
Below are the system specifications:
Isaac Sim - 2022.2.1
Ubuntu - 20.04
CPU - Intel Core i7
Cores - 24
RAM - 32 GB
GPU - GeForce RTX 3060 Ti
VRAM - 16 GB
Disk - 2 TB SSD