How to generate binary object masks for synthetic data offline pose generation


I have a question regarding offline pose generation with synthetic data. My model requires a mask for each object in every frame. I have already generated visibility masks, i.e. binary images representing the visible parts of each object, using the semantic segmentation annotator. However, I also need the complete masks of the objects, including the parts that are occluded. Can anyone provide advice or suggestions on how to solve this?

Hi @francesco.sarno,

Are you using your own custom writer or one of the existing writers in the example?

If you have the original object mesh file as an .obj or .ply, one hacky solution I can think of is reprojecting the mesh back onto the image using the ground truth annotations. That way, you can get a complete object mask.
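To illustrate the idea, here is a minimal sketch of such a reprojection (this is not the actual script mentioned below; the function name `reproject_mask` and the point-splat approach are my own assumptions). It transforms the mesh vertices into the camera frame with the ground-truth pose and projects them with the camera intrinsics; for a dense mesh the splatted points give a near-solid mask, while a sparse mesh would need proper triangle rasterization:

```python
import numpy as np

def reproject_mask(vertices, R, t, K, height, width):
    """Project mesh vertices into the image using a ground-truth pose.

    vertices: (N, 3) object-space points; R (3, 3) rotation and t (3,)
    translation of the model-to-camera pose; K (3, 3) camera intrinsics.
    Returns a binary (height, width) mask built by splatting the
    projected vertices (approximate; dense meshes give a solid mask).
    """
    cam = vertices @ R.T + t                 # object frame -> camera frame
    cam = cam[cam[:, 2] > 0]                 # keep points in front of camera
    uv = cam @ K.T
    uv = uv[:, :2] / uv[:, 2:3]              # perspective divide
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    keep = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    mask = np.zeros((height, width), dtype=np.uint8)
    mask[v[keep], u[keep]] = 1
    return mask
```

Because the projection uses the full mesh rather than the rendered image, the resulting mask includes occluded parts of the object.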

If you think this could work for you, let me know and I can find some scripts that do the reprojection.

In the meantime, I will see if it is possible to directly get the complete object masks when you generate data.


How precise are you looking to be?

You can use the Bounding Box Loose annotator if bounding boxes are precise enough for your annotations.

If you want an exact mask for a small number of items, you could do it with multiple triggers, modifying the visibility of a group containing every object EXCEPT the item(s) you are trying to detect:


occlusion_group = ...  # group containing the items I want to hide

with rep.trigger.on_frame(num_frames=10):  # toggle visibility every frame
    with occlusion_group:
        rep.modify.visibility(rep.distribution.sequence([True, False]))

# Or randomize every other frame instead:
# with rep.trigger.on_frame(num_frames=5, interval=2):

If you are looking for exact pixel masks for every object in your scene, that is not easily done, but you could apply the approach above to every object in the scene. It would not be performant, and it would require a lot of data management afterwards.

Hi @andrew_g,

Thank you for your answer! Yes, I am running my own writer scripts, and I have a .obj file for the mesh of the objects. I think your solution could work. If you can share the scripts you mentioned, that would be great!


Hi @francesco.sarno,

Here is the script and a simple example: (60.0 MB)

Note that I am not sure which annotation format you are using; the script expects annotations in the BOP format. If you are using a different format, it should be straightforward to change how the script reads in the annotations.
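For reference, a BOP scene_gt.json maps each frame id to a list of per-object poses, each with a row-major 3x3 rotation ("cam_R_m2c", model-to-camera) and a translation in millimetres ("cam_t_m2c"). A minimal loader assuming that layout (the function name `load_bop_poses` is mine, not from the script) could look like:

```python
import json
import numpy as np

def load_bop_poses(path_json):
    """Read a BOP-style scene_gt.json.

    Returns {frame_id: [(obj_id, R, t), ...]} where R is a (3, 3)
    rotation and t a (3,) translation in millimetres, both
    model-to-camera as stored by the BOP format.
    """
    with open(path_json) as f:
        scene_gt = json.load(f)
    poses = {}
    for frame_id, entries in scene_gt.items():
        poses[int(frame_id)] = [
            (e["obj_id"],
             np.asarray(e["cam_R_m2c"], dtype=float).reshape(3, 3),
             np.asarray(e["cam_t_m2c"], dtype=float))
            for e in entries
        ]
    return poses
```

If your writer emits a different JSON layout, only this reading step needs to change.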

For this particular example, you can run the script using the command:

python <script_name>.py --path_json 001001/scene_gt.json --mesh_path obj_000003.ply

Let me know if you run into any trouble.



Hi @andrew_g,

Thank you so much for your help. I tried your script with my custom data, but there seems to be a mismatch between the real object's appearance and the reprojected one: the object pose is completely different. Do you know why this could happen?

I generated the .json file containing the camera and object poses using Isaac Sim's default functions.
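One possible culprit worth checking (this is an assumption about the cause, not a confirmed diagnosis of the issue above): USD/Isaac Sim cameras look down -Z with +Y up, while OpenCV/BOP-style cameras look down +Z with +Y down. If the exported pose uses one convention and the reprojection script expects the other, the object will appear in a completely different pose. Converting between the two is a 180-degree rotation about the camera X axis; the helper name `usd_to_opencv_pose` is mine:

```python
import numpy as np

# Flip between USD (camera looks down -Z, +Y up) and OpenCV (camera looks
# down +Z, +Y down) conventions: a 180-degree rotation about camera X.
FLIP_X = np.diag([1.0, -1.0, -1.0])

def usd_to_opencv_pose(R_usd, t_usd):
    """Convert a model-to-camera pose from the USD camera convention to
    the OpenCV/BOP convention (and back, since the flip is its own
    inverse)."""
    return FLIP_X @ np.asarray(R_usd), FLIP_X @ np.asarray(t_usd)
```

A quick sanity check: a point 5 units in front of a USD camera sits at z = -5 in its frame, and should land at z = +5 after conversion.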