Sythetic data recording for BBox3D

xuxu · July 17, 2021, 3:07pm

Hi, there is BBox 3D in Synthetic Data Visualizer, but I can’t get BBox3D output directly through Synthetic Data Recorder at the moment, is that right? Will this function be added to the extension in the future?
If I want to get it, is it possible through python script?

xuxu · July 19, 2021, 10:22am

I try to modify the code of the syntheticdata_recorder extension to implement saving the image of the 3D boundingbox from the GUI as well as the npy file, and saving the camera parameters.

It looks not bad.

xuxu · July 19, 2021, 10:30am

Here’s the codes I modified, if that helps. There may be some errors and the original file needs to be backed up first.

syntheticdata_recorder.py (40.6 KB)
visualization.py (6.3 KB)
writer.py (11.4 KB)

ltorabi · July 19, 2021, 4:20pm

Hi xuxu,

Awesome! Well done!
Looks really good.

Thanks for the code.

Kindly,
Liila

xuxu · July 22, 2021, 5:27pm

Hello, I am still trying to call the get_pose interface, but this is not going well. when it is calling

        for sensor in gt_sensors:
            if sensor not in ["camera", "pose"]:
                if sensor == "instanceSegmentation":
                    gt[sensor] = self.sensor_helpers[sensor](viewport, parsed=True, return_mapping=True)
                elif sensor == "boundingBox3D":
                    gt[sensor] = self.sensor_helpers[sensor](viewport, parsed=True, return_corners=True)
                else:
                    gt[sensor] = self.sensor_helpers[sensor](viewport)
                current_sensor = self.sensor_helper_lib.create_or_retrieve_sensor(viewport, self.sensor_types[sensor])
                current_sensor_state = self.sd_interface.is_sensor_initialized(current_sensor)
                sensor_state[sensor] = current_sensor_state
            else:
                gt[sensor] = self.sensor_helpers[sensor](viewport)

in syntheticdata.py, it returns error [Error] [carb.events.python] TypeError: get_pose() takes 1 positional argument but 2 were given
because it is defined as

    def get_pose(self):
        """Get pose of all objects with a semantic label.
        """
        stage = omni.usd.get_context().get_stage()
        mappings = self.generic_helper_lib.get_instance_mappings()
        pose = []
        for m in mappings:
            prim_path = m[0]
            prim = stage.GetPrimAtPath(prim_path)
            prim_tf = UsdGeom.Xformable(prim).ComputeLocalToWorldTransform(0.0)
            pose.append((str(prim_path), m[1], str(m[2]), np.array(prim_tf)))
        return pose

in the same file.
This parameter does not seem to be used, but when I remove it, an error about c++ will be returned. Do you know how to fix it?

[Error] [carb.events.python] ArgumentError: Python argument types in
    None.GetPrimAtPath(Stage, numpy.int32)
did not match C++ signature:
    GetPrimAtPath(pxrInternal_v0_20__pxrReserved__::UsdStage {lvalue}, pxrInternal_v0_20__pxrReserved__::SdfPath path)

At:
  /home/ubuntu/.local/share/ov/pkg/isaac_sim-2021.1.1/exts/omni.isaac.synthetic_utils/omni/isaac/synthetic_utils/scripts/syntheticdata.py(135): get_pose

sdebnath · July 24, 2021, 12:44am

Hi @xuxu , Can you try with the updated get_pose() below?

    def get_pose(self, viewport=None):
        """Get pose of all objects with a semantic label.
        """
        stage = omni.usd.get_context().get_stage()
        mappings = self.generic_helper_lib.get_instance_mappings()
        pose = []
        for m in mappings:
            prim_path = m[1]
            prim = stage.GetPrimAtPath(prim_path)
            prim_tf = omni.usd.get_world_transform_matrix(prim)
            pose.append((str(prim_path), m[2], str(m[3]), np.array(prim_tf)))
        return pose

xuxu · July 25, 2021, 10:16am

Cool! It works. It just throws an error only when recording data for the first time, but it seems to have no effect.

Viewport  :  ['rgb', 'boundingBox3D', 'camera', 'pose']
2021-07-25 10:02:43 [148,065ms] [Error] [carb.python] [py stderr]: /home/ubuntu/.local/share/ov/pkg/isaac_sim-2021.1.1/kit/extscore/omni.kit.pip_archive/pip_prebundle/numpy/core/_asarray.py:136: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  return array(a, dtype, copy=False, order=order, subok=True)

I can get a 4x4 transfer matrix for each object on the entire stage (though I thought it was the object in the current perspective). After a few tries I learned that this transfer matrix is between the object coordinate system and the world coordinate system. (also, the transfer matrix obtained from 3DBox)

 ['/Warehouse/SM_CardBoxA_02/SM_CardBoxA_02', 2, 'CardBox',
        array([[ 3.37917009e-01,  9.41176037e-01,  0.00000000e+00, 0.00000000e+00],
               [-9.41176037e-01,  3.37917009e-01,  0.00000000e+00, 0.00000000e+00],
               [ 0.00000000e+00,  0.00000000e+00,  1.00000000e+00, 0.00000000e+00],
               [-9.21907471e+02,  1.04652783e+03,  1.29935669e+02, 1.00000000e+00]])                                ]],
      dtype=object)

Now I want to get the point cloud data (i.e. xyz coordinate data of the points) from the saved rgb image data and depth image data. As far as I know I need the Intrinsic parameters of the camera (fx, fy, cx, cy) and then I can calculate it. But how can I get these parameters? Or is there any other way to get these point cloud data?
Thanks a lot.

xuxu · July 25, 2021, 5:16pm

I get the answer from [A few snippets which might be useful to compute camera intrinsics]

# compute focal point and center
focal_x = height * focal_length / vert_aperture
focal_y = width * focal_length / horiz_aperture
center_x = height * 0.5
center_y = width * 0.5

Now I can get the xyz coordinates of the points, if I’m right, these are in the camera coordinate system? So I need to transform them into the world coordinate system (using the camera’s transformation matrix relative to the world coordinate system) and then I can match the coordinates I got for the 3D bbox above.

sdebnath · July 30, 2021, 6:02pm

@xuxu Once you have the camera intrinsics and extrinsics (camera’s transformation matrix relative to the world coordinate system), you can use them to get xyz pointcloud in world coordinates from depth image.

The 3D bbox also outputs xyz in world coordinates

xuxu · July 30, 2021, 7:09pm

Thank you for your reply! I am currently trying to use Isaacsim to make a dataset like SUN RGBD for Votenet. It is not easy at the beginning. A small question is that when returning the camera parameters, only ‘horizontal_aperture’ is returned, not ‘vert_aperture’. But I can read this parameter in Isaacsim to calculate focal_x and focal_y. (Sometimes they are not equal)

sdebnath · July 30, 2021, 7:57pm

@xuxu I have seen the issue sometimes when the ratio of horizontal_aperture and vert_aperture is not equal to resolution_width and resolution_height as the horizontal_aperture and vert_aperture USD attributes are pre-filled with default values.

OPTION 1

Usually, that gets fixed once I set a camera as the active camera for that viewport.

import omni
viewport_handle_1 = omni.kit.viewport.get_viewport_interface().get_instance("Viewport")
viewport_window_1 = omni.kit.viewport.get_viewport_interface().get_viewport_window(viewport_handle_1)
viewport_window_1.set_active_camera("/World/Camera")

After you set the active camera using the above script and then read USD attributes, the ratio of horizontal_aperture and vert_aperture will be equal to resolution_width and resolution_height

OPTION 2

Calculate vertical_aperature as follows vert_aperture = height/width * horiz_aperture

xuxu · August 13, 2021, 10:34am

Hi, there may be a bug when using get_bounding_box_3d, maybe you could try if you can reproduce it? Specifically, if there is an object in the current view that is out of viewport, then an error is returned.

If no object is beyond the view limit, there is no problem. As shown in the picture. Thanks.

sdebnath · August 26, 2021, 7:20am

@xuxu I created a simple script that can be run in the script editor and it dumps out 3D bounding box output. I am not able to repro your issue with this script and a simple map (attached herewith).
bbox_test.usd (19.9 KB)
get_bounding_box_3d.py (2.1 KB)

system · October 25, 2021, 7:20am

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Get point cloud and 3d bbox data in synthetic data Isaac Sim synthetic-data , isaacsim	1	569	April 5, 2024
Bounding_box_3d writer doesn't work in Synthetic Data Recorder Isaac Sim	2	317	January 9, 2024
Get sensor data from isaac sim Isaac Sim kaya , synthetic-data	4	1880	April 5, 2024
Output the extrinsic and intrinsic camera parameters during synthetic data generation Synthetic Data Generation (SDG) camera , synthetic-data , isaacsim	1	456	May 10, 2024
Replicator - how to get cameraViewTransform Synthetic Data Generation (SDG) synthetic-data	9	2202	October 19, 2023
Capture frames from camera via Python scripting Isaac Sim python , synthetic-data	36	7814	April 5, 2024
export 3D bounding boxes Isaac Sim	3	952	April 5, 2024
Camera data in numpy format in Isaac sim Isaac Sim synthetic-data	2	1110	August 3, 2022
Mathematical definition of 3D Bounding Boxes Annotator Nvidia Omniverse Isaac Sim Isaac Sim	2	1387	April 5, 2024
Offline Dataset Generation(generator.py) - get_groundtruth rgb -> returns gray scale RGB Isaac Sim synthetic-data	4	949	April 5, 2024

Sythetic data recording for BBox3D

OPTION 1

OPTION 2

Related topics