How to determine the depth scale of synthetically generated data from the DOPE writer?

I have generated synthetic data of a table-top scene using the DOPE writer, capturing the RGB image, depth image, semantic segmentation, and poses. Each pose comprises a location and an orientation.

  • What is the depth scale of the generated data, such that reconstructing the point cloud through the Open3D APIs yields an accurate representation? How can I find out the depth scale? (A sketch of the reconstruction I have in mind follows this list.)

  • The location values all appear to be 100 times the desired values. For instance, in a table-top scene the camera is placed at a height of 1.2 m and the table is 1 m x 1 m, yet the corresponding location values in the generated data come out as [65488, 17.9669075, -116.375427]. They look like 100 times the actual values. If so, why is that?
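
For reference, here is a minimal sketch of the Open3D reconstruction I have in mind. The intrinsics, file name, and depth_scale below are placeholder values I made up; depth_scale is the factor that converts the stored depth values to meters, and that factor is exactly what I am unsure about. (If the stage units are centimeters, which would also explain the 100x location values, the factor would presumably be 100.)

import numpy as np
import open3d as o3d

# Placeholder pinhole intrinsics; replace with the render camera's values.
width, height = 640, 480
fx, fy, cx, cy = 641.37, 659.22, 320.0, 240.0
intrinsic = o3d.camera.PinholeCameraIntrinsic(width, height, fx, fy, cx, cy)

# Depth image written out alongside the RGB frame (hypothetical file name;
# assumed to hold float32 distance_to_image_plane values).
depth = np.load('/Test_Prop_test/depth_000013.npy').astype(np.float32)

# depth_scale converts the stored values to meters: 1.0 if the writer already
# stores meters, 100.0 if the stage units are centimeters, 1000.0 for mm.
depth_scale = 100.0  # placeholder; this is the value in question

pcd = o3d.geometry.PointCloud.create_from_depth_image(
    o3d.geometry.Image(depth),
    intrinsic,
    depth_scale=depth_scale,
    depth_trunc=10.0,  # discard points farther than 10 m after scaling
)
o3d.visualization.draw_geometries([pcd])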

Below are the system specifications:

Isaac Sim - 2022.2.1
Ubuntu - 20.04
CPU - Intel Core i7
Cores - 24
RAM - 32 GB
GPU - GeForce RTX 3060 Ti
VRAM - 16 GB
Disk - 2 TB SSD

Hi Team, could you please let me know if there is any update on this?

Hi @deveshkumar21398 I’m moving this to the Isaac Sim forum, since I think you’ll have a better chance of getting answers about the DOPE writer there.

Hi there,

How are you getting the depth data using the DOPE writer? In the provided tutorial, the writer returns RGB and a JSON file with the pose and the projected cuboid.

If you have a custom writer using depth data, here is some information on the annotators:
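
For example, a custom writer (or a quick script) can attach the distance_to_image_plane annotator to a render product and read the depth back directly. Below is a minimal sketch using omni.replicator.core; the camera prim path and resolution are placeholders:

import omni.replicator.core as rep

# Render product for the capturing camera (prim path is a placeholder).
render_product = rep.create.render_product("/World/Camera", (640, 480))

# Depth annotator: per-pixel distance to the camera's image plane.
depth_annotator = rep.AnnotatorRegistry.get_annotator("distance_to_image_plane")
depth_annotator.attach(render_product)

# Step once so the annotator produces data, then read it back as a
# (height, width) float32 array.
rep.orchestrator.step()
depth = depth_annotator.get_data()
print(depth.shape, depth.dtype)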

Best,
Andrei

Any idea why the projected center of the object comes out quite close, but the orientation is not correct? The code is below; the correct pose of the object in Isaac Sim is attached at the end.

import numpy as np
import cv2
import json
import os
import matplotlib.pyplot as plt

def project2d(intrinsic, point3d):
    # Perspective-divide by z, then apply the pinhole intrinsics.
    return (intrinsic @ (point3d / point3d[2]))[:2]


axis_len = 30

image_path = '/Test_Prop_test/000013.png'
json_path = '/Test_Prop_test/camera_params_000013.json'
npy_path = '/Test_Prop_test/bounding_box_3d_000013.npy'

img = cv2.imread(image_path)
height, width = img.shape[:2]

# Hard-coded pinhole intrinsics (fx, fy, cx, cy) for a 640x480 image.
cam_intrinsic = np.array([[641.37, 0.0, 320.0],
                          [0.0, 659.22, 240.0],
                          [0.0, 0.0, 1.0]])

with open(json_path, 'r') as file:
    data = json.load(file)

    # cameraViewTransform is stored flat in row-major (row-vector)
    # convention; reshape and transpose to get a column-vector matrix.
    camera_view_transform = np.array(data["cameraViewTransform"]).reshape(4, 4)
    T_cam_world = camera_view_transform.T
    print("T_cam_world: ", T_cam_world)

    camera_projection = np.array(data["cameraProjection"]).reshape(4, 4)
    camera_projection = camera_projection.T

# Each entry from the 3D bounding-box annotator holds the semantic id at
# index 0 and the 4x4 object-to-world transform at index 7.
BBtrans = np.load(npy_path, allow_pickle=True).tolist()

for entry in BBtrans:
    obj_id = entry[0] + 1
    transformation_matrix = entry[7]
    T_obj_to_world = transformation_matrix.T

    if obj_id == 1:
        # Invert to get the world-to-object and world-to-camera transforms,
        # then chain them and invert again to get the object pose in the
        # camera frame.
        T_world_to_obj = np.linalg.inv(T_obj_to_world)
        T_world_to_cam = np.linalg.inv(T_cam_world)
        T_cam_to_obj = T_world_to_obj @ T_world_to_cam
        RT = np.linalg.inv(T_cam_to_obj)

        # Project the object centre into the image.
        RT_centre = RT[:3, -1]
        RT_centre = (cam_intrinsic @ (RT_centre / RT_centre[2]))[:2]
        print("obj_center: ", RT_centre)
        cv2.circle(img, tuple(RT_centre.astype(int)), radius=2, color=(0, 255, 0), thickness=5)

        # Draw the object's x, y, z axes (red, green, blue in BGR order).
        rgb_colors = [(0, 0, 255), (0, 255, 0), (255, 0, 0)]
        for j in range(3):
            obj_xyz_offset_2d = project2d(cam_intrinsic, RT[:3, -1] + RT[:3, j] * 0.001)
            obj_axis_endpoint = RT_centre + (obj_xyz_offset_2d - RT_centre) / np.linalg.norm(obj_xyz_offset_2d - RT_centre) * axis_len
            print("obj_xyz_offset_2d: ", obj_xyz_offset_2d)
            print("obj_axis_endpoint: ", obj_axis_endpoint)
            cv2.arrowedLine(img, (int(RT_centre[0]), int(RT_centre[1])), (int(obj_axis_endpoint[0]), int(obj_axis_endpoint[1])), rgb_colors[j], thickness=2, tipLength=0.15)


img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(img_rgb)
plt.title('Image with Arrow')
plt.axis('off')
plt.show()
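
As a debugging step (a sketch, not verified): the script hard-codes cam_intrinsic but never uses the cameraProjection matrix it loads. Assuming an OpenGL-style projection with a symmetric frustum, the intrinsics can be derived from that matrix and compared against the hard-coded values:

# Derive pinhole intrinsics from an OpenGL-style projection matrix with a
# symmetric frustum (camera_projection, width, and height as loaded above).
fx = camera_projection[0, 0] * width / 2.0
fy = camera_projection[1, 1] * height / 2.0
cx, cy = width / 2.0, height / 2.0
print("intrinsics derived from cameraProjection:", fx, fy, cx, cy)

If these disagree with the hard-coded 641.37 / 659.22 / 320 / 240 values, that mismatch alone could shift the projection, independent of any orientation issue.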