Replication output data types - pose?

peter.gaston · August 25, 2022, 7:03pm

It would be very useful for “pose” to be an Output data type - along w/ rgb, instance, semantic_seg, etc.

As a question - is there a way to get “pose” out of Replicator Composer?

This would be used to feed an object pose algorithm (I’m currently using EfficientPose)

Thanks!
p

WendyGram · August 29, 2022, 4:12pm

Hello @peter.gaston! I’ve reached out to the team about your questions. I will report back here when I have more information!

jiehanw · August 29, 2022, 7:43pm

Hi @peter.gaston , you can use the transform data output by the bounding_box_3d annotator as the pose.

peter.gaston · September 5, 2022, 2:27pm

thx! lots of hidden data lurking around, eh? Excellent!

mayank.ukani · March 31, 2023, 5:37pm

Hi @peter.gaston.

I hope you are doing well. I am also interested in working with 6D pose estimation deep learning methods. Were you able to train any model with your dataset? All the work I see is based on the YCB video dataset only.

Thanks,
Mayank

peter.gaston · March 31, 2023, 7:06pm

Not sure exactly what your question is - I’ll throw out some ideas - feel free to be more specific…

I have an ML model doing pose estimation using synthetic data (and real data). The synthetic data is composed primarily using replicator. Per the topic here, replicator does not expose the camera pose. However, if one sets the camera at, say, 0,0,0.5 pointing 0,0,90 (or whatever), then one can easily deduce the camera pose. I.e., don’t move the camera - move everything else. I created 65,000 images for my initial domain randomization training, and several thousand more so far to test in more reasonable conditions (see below).

For my case, it’s not really 6D pose. Given the exact environment (pallets in a warehouse), it’s really only X, Y and a yaw. The floor constrains the rest.
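For anyone wanting to reduce a full transform to that X, Y, yaw form, here is a minimal sketch. It assumes the transform is a 4x4 matrix in the row-vector convention (translation in the last row, points multiplied on the left) with Z up; that matches how the matmul later in this thread is written, but check it against your own data. The function names and `make_planar_transform` are hypothetical helpers for illustration, not part of replicator.

```python
import math

import numpy as np


def planar_pose_from_transform(transform):
    """Return (x, y, yaw_degrees) from a 4x4 row-vector transform.

    Assumption: translation sits in the last row and Z is the up axis.
    """
    m = np.asarray(transform, dtype=float).reshape(4, 4)
    x, y = m[3, 0], m[3, 1]
    # Row 0 is the object's local X axis expressed in the parent frame,
    # so its direction in the XY plane gives the yaw.
    yaw_deg = math.degrees(math.atan2(m[0, 1], m[0, 0]))
    return float(x), float(y), yaw_deg


def make_planar_transform(x, y, z, yaw_deg):
    """Hypothetical helper: build a row-vector transform for a yawed object."""
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    return np.array([
        [c, s, 0.0, 0.0],
        [-s, c, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [x, y, z, 1.0],
    ])


if __name__ == "__main__":
    pallet = make_planar_transform(1.0, 2.0, 0.0, 30.0)
    print(planar_pose_from_transform(pallet))
```

A uniform positive scale on the object does not change the result, since `atan2` only depends on the direction of the first row.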

We’ve played with various models. We’ve used EfficientPose, a two-stage Mask R-CNN followed by either a direct-to-pose ML model or a geometry-based algorithm, and currently a key point based approach followed by a geometry algorithm. Your mileage will vary. We like the key point approach as it’s human explainable for failure modes, and it seems easier to work out how to further train the model to fix those failures.

So I would recommend using replicator to create a boat-ton of synthetic images to train on and work from there.

Example image:

mayank.ukani · March 31, 2023, 7:17pm

Thanks a lot for such a quick and detailed reply. I have a few more questions; please bear with me, as I am new to this.

For pose-related models, is a camera pose required? (You mentioned keeping the camera fixed.)
When I calculated the camera intrinsic matrix (using this guide) to convert the depth image to a point cloud, I noticed that it is not accurate. How are you calculating the camera intrinsic matrix?

peter.gaston · March 31, 2023, 7:34pm

Is a camera pose required? Well, yes. You want the ground truth position of whatever you’re looking at in relation to the camera.

To get the camera matrix, I cheated. I used another method that works and outputs the camera pose (incl. the intrinsic matrix). See section 2.6 on page https://docs.omniverse.nvidia.com/app_isaacsim/app_isaacsim/tutorial_replicator_recorder.html - except my code is:

    import omni.replicator.core as rep

    with rep.new_layer():
        camera = rep.create.camera()
        with rep.trigger.on_frame():
            with camera:
                rep.modify.pose(
                    position=(0.0, 0.0, 0.35),
                    rotation=(0, 0, 180),
                )
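On the intrinsic matrix question itself, a rough sketch of the usual pinhole calculation may help as a cross-check. This is not the recorder's own code: it assumes square pixels, a principal point at the image center, focal length and horizontal aperture given in the same units, and a depth image that stores distance along the optical axis rather than ray length from the camera. If the depth is actually ray length, the point cloud will look bowed, which is one possible source of the inaccuracy mentioned above. All function and parameter names are illustrative.

```python
import numpy as np


def intrinsic_matrix(focal_length, horizontal_aperture, width, height):
    """Pinhole K from focal length and aperture (same units), image size in pixels.

    Assumptions: square pixels, principal point at the image center.
    """
    fx = width * focal_length / horizontal_aperture
    fy = fx  # square-pixel assumption
    cx, cy = width / 2.0, height / 2.0
    return np.array([
        [fx, 0.0, cx],
        [0.0, fy, cy],
        [0.0, 0.0, 1.0],
    ])


def depth_to_points(depth, K):
    """Back-project an (H, W) depth map to (H*W, 3) camera-frame points.

    Assumption: depth is distance along the optical axis (image-plane depth).
    """
    depth = np.asarray(depth, dtype=float)
    h, w = depth.shape
    # u indexes columns, v indexes rows; both arrays have shape (H, W).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - K[0, 2]) * depth / K[0, 0]
    y = (v - K[1, 2]) * depth / K[1, 1]
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)


if __name__ == "__main__":
    K = intrinsic_matrix(24.0, 20.955, 1280, 720)  # example values only
    cloud = depth_to_points(np.full((720, 1280), 2.0), K)
    print(K)
    print(cloud.shape)
```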

mayank.ukani · April 6, 2023, 8:54am

Hi @peter.gaston. I thank you for replying so quickly with all the details.

I assume that you are using the transform data from the “3D bounding box” to get the rotation and translation of the object. Were you able to achieve good accuracy using the synthetic data?

peter.gaston · April 6, 2023, 12:13pm

If you mean good accuracy from the transforms, yes. Spot on. Simple matrix multiplication. threeDXForm is the 3d transform from Isaac. camTransform is what Isaac would return if they had implemented that.

        self.fixedXf = np.matmul(self.palletFacePts,self.threeDXForm)
        self.camPalletPts = np.matmul(self.fixedXf,self.camTransform)
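For readers who want to try the same chain outside the class, here is a self-contained sketch. It assumes homogeneous row vectors (points on the left, translation in the last row), that the first matrix maps object-local points to world, and that the second maps world to camera (taken here as the inverse of the camera's world pose). That last reading of camTransform is my assumption, and `translation` is a hypothetical helper for the demo.

```python
import numpy as np


def to_homogeneous(points):
    """Append a column of ones to an (N, 3) array of points."""
    pts = np.asarray(points, dtype=float)
    return np.hstack([pts, np.ones((pts.shape[0], 1))])


def local_to_camera(local_points, object_to_world, world_to_camera):
    """Map object-local points into the camera frame (row-vector convention)."""
    world = to_homogeneous(local_points) @ object_to_world
    cam = world @ world_to_camera
    return cam[:, :3]


def translation(tx, ty, tz):
    """Hypothetical helper: pure translation, stored in the last row."""
    m = np.eye(4)
    m[3, :3] = (tx, ty, tz)
    return m


if __name__ == "__main__":
    object_to_world = translation(1.0, 2.0, 0.0)    # pallet pose in the world
    camera_to_world = translation(0.0, 0.0, 0.35)   # fixed camera, no rotation
    world_to_camera = np.linalg.inv(camera_to_world)
    face_pts = [[0.5, 0.0, 0.0], [-0.5, 0.0, 0.0]]  # made-up pallet face points
    print(local_to_camera(face_pts, object_to_world, world_to_camera))
```

With a fixed camera, `world_to_camera` only needs to be computed once for the whole dataset.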

If you mean good pose data, that depends on the algorithm. I currently use a two-stage approach. First stage is key points - they’re averaging 1-2 pixels off (L2), which is very good, all things considered. Then I do some geometry to get pose, which is fine. So I’m meeting targets at present. My ML algorithm has an input of 1/3 the size of the image, so that loses pixels. And of course one can always be off by one pixel due to rounding. And the ML key point prediction itself can be off. So all in, I’m happy.
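If you want to measure key point error the same way, a minimal sketch of a mean L2 pixel error follows, plus the rescaling step from network input resolution back to full image resolution. It reads "1/3 the size" as each image dimension being scaled by 1/3, which is an assumption on my part; the function names and sample values are made up for illustration.

```python
import numpy as np


def mean_keypoint_error(predicted, ground_truth):
    """Mean Euclidean (L2) distance in pixels between matched (N, 2) key points."""
    pred = np.asarray(predicted, dtype=float)
    gt = np.asarray(ground_truth, dtype=float)
    return float(np.linalg.norm(pred - gt, axis=1).mean())


def rescale_keypoints(keypoints, scale):
    """Map key points from network input resolution to full image resolution."""
    return np.asarray(keypoints, dtype=float) * scale


if __name__ == "__main__":
    pred_small = [[100.0, 50.0], [210.0, 80.0]]   # predictions at 1/3 resolution
    gt_full = [[301.0, 150.0], [630.0, 242.0]]    # ground truth at full resolution
    pred_full = rescale_keypoints(pred_small, 3.0)
    print(mean_keypoint_error(pred_full, gt_full))
```

Under this reading, an error of one pixel at the network's input resolution becomes three pixels in the full image, so it is worth being explicit about which resolution an error figure refers to.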
