Save camera position and rotation with Writer for offline dataset generation

Hello,
I would like to create a Writer that, for each image, records the position and rotation of the camera in a NumPy array, and saves that array at the end.
I wrote this:

import numpy as np
import omni.usd
import omni.replicator.core as rep
from PIL import Image
from scipy.spatial.transform import Rotation
from omni.isaac.core.prims import XFormPrim
from omni.isaac.core.utils.stage import open_stage
from omni.replicator.core import AnnotatorRegistry, BackendDispatch, Writer


class ArenaWriter(Writer):
    def __init__(
        self,
        output_dir,
        rig,
        image_output_format="png",
        n_data=10,
        n_attr=5
    ):
        self._output_dir = output_dir
        self._backend = BackendDispatch({"paths": {"out_dir": output_dir}})
        self._frame_id = 0
        self._image_output_format = image_output_format
        self._rig = rig
        # One row per frame: x, y position plus the xyz Euler angles
        self._attributes = np.zeros(shape=(n_data, n_attr))

        self.annotators = [AnnotatorRegistry.get_annotator("rgb")]

    def write(self, data):
        im_array = np.array(data["rgb"])
        im_PIL = Image.fromarray(np.uint8(im_array))
        im_PIL = im_PIL.resize((128, 64), Image.ANTIALIAS)  # Image.LANCZOS in newer Pillow versions
        im_array = np.array(im_PIL)
        filepath = f"rgb_{self._frame_id}.{self._image_output_format}"

        # get_world_pose() returns (position, quaternion). Isaac Sim quaternions
        # are scalar-first (w, x, y, z) while SciPy expects scalar-last, hence the reorder.
        pos, quat_wxyz = self._rig.get_world_pose()
        rot = Rotation.from_quat(np.asarray(quat_wxyz)[[1, 2, 3, 0]])
        rot_euler = rot.as_euler("xyz", degrees=True)

        self._attributes[self._frame_id, :2] = pos[:2]
        self._attributes[self._frame_id, 2:] = rot_euler  # 3 angles fill the remaining slots
        self._backend.write_image(filepath, im_array)

        self._frame_id += 1

    def on_final_frame(self):
        np.save(self._output_dir + "attributes.npy", self._attributes)


def main():

    # Open the environment in a new stage
    print(f"Loading Stage {ENV_URL}")
    open_stage(ENV_URL)

    stage = omni.usd.get_context().get_stage()

    # Create Replicator Camera
    cam = rep.create.camera(
        position=(0, 0, 0.25),
        rotation=(0, 0, 0),
        focal_length=28,
        fisheye_max_fov=110,
        clipping_range=(0.01, 20.)
    )

    cam_node = cam.node
    print(cam_node)
    cam_rig_path = rep.utils.get_node_targets(cam_node, "inputs:prims")[0]
    print(cam_rig_path)
    cam_path = str(cam_rig_path) + "/Camera"    
    print(cam_path)
    rig = XFormPrim(prim_path=cam_rig_path)

    rep.WriterRegistry.register(ArenaWriter)
    writer = rep.WriterRegistry.get("ArenaWriter")
    out_dir = "/root/Documents/arena/data/"
    writer.initialize(output_dir=out_dir, rig=rig, n_data=CONFIG["num_frames"])

    # Create a Replicator render product for the camera
    RESOLUTION = (1600, 1300)
    camera_rp = rep.create.render_product(cam, RESOLUTION)

    # Attach the render product to the Writer
    writer.attach([camera_rp])

    with rep.trigger.on_frame():
        with cam:
            rep.modify.pose(
                    position=rep.distribution.uniform((-1.8, -1.3, 0.25), (1.8, 1.3, 0.25)),
                    rotation=rep.distribution.uniform((0, 0, 0), (0, 0, 360))
            )


    for i in range(CONFIG["num_frames"]):
        rep.orchestrator.step()
    writer.on_final_frame()

But I think this is suboptimal in terms of time, so I preferred to use:

rep.orchestrator.run()

# Wait until started
while not rep.orchestrator.get_is_started():
    simulation_app.update()

# Wait until stopped
while rep.orchestrator.get_is_started():
    simulation_app.update()

rep.BackendDispatch.wait_until_done()
rep.orchestrator.stop()

But with this I have trouble saving the camera position and rotation, because there is a mismatch between the camera parameters and the saved image.

Hi @Leopold_M - Someone from our team will review and respond.

Hi there,

AFAIK, manually using the step() function should not cause any significant overhead.

The issue with the rig pose being wrong could be caused by an off-by-one frame between the annotator data and the stage. I will look into this and come back to you.

Best,
Andrei

Hello,

Thanks a lot for your answer. Indeed, when I use the step function it works well, but it seems to take more time than rep.orchestrator.run().

I see. I expected that, due to the large resolution, the AOV processing would take most of the time, and that the extra processing frame would therefore not be noticeable.

For now, you could also try adding multiple cameras to parallelize data processing and possibly shorten the relative overhead.
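
For example, a minimal sketch (not from the thread; cams and render_products are illustrative names, and the writer's write() would need adapting to handle per-render-product data):

cams = [rep.create.camera() for _ in range(4)]
render_products = [rep.create.render_product(c, RESOLUTION) for c in cams]
# A single writer can be attached to several render products at once
writer.attach(render_products)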

Future releases should not have this issue anymore.

I tried again, and I think last time was just luck, because in general I don't get any correspondence between the image and the position of the camera.
I tried to add rep.BackendDispatch.wait_until_done() in the loop, because I thought it would force waiting until all the backend writes are done, but it didn't solve the problem.
Also, I found that the images saved after the first two steps are always the same, and I don't know why.

Backend dispatch should not influence the writing workflow: it caches the data and writes it to file, and once the cache is full it slows down the workflow in order to catch up with writing data to disk.

Are you getting the off-by-one frame issue using step() as well? Can you check using an ordered sequence:

    with rep.trigger.on_frame():
        with cam:
            rep.modify.pose(position=rep.distribution.sequence([(0,0,0), (0,0,1), (0,0,2)]))

Thank you for the advice. I tried rep.distribution.sequence instead of rep.distribution.uniform.
The saved attributes are correct, but the images are not. I noticed that the image associated with the first attributes is saved twice, which shifts all the images relative to the attributes, like this:
im0 im0 im1 im2 …
att0 att1 att2 att3 …
But I have no idea where this comes from, because I only use rep.orchestrator.step().

Hello,

Have you looked at the camera_params annotator?

Annotators Information — Omniverse Extensions documentation (nvidia.com)

It will give you per-frame data from the rendered camera, including a cameraViewTransform parameter from which you can derive the position and rotation.

Hello,

I tried to use cameraViewTransform from the CameraParams annotator with a simpler example:

import omni.replicator.core as rep
from omni.replicator.core import AnnotatorRegistry, Writer


class PrintWriter(Writer):
    def __init__(self):
        self.annotators = [AnnotatorRegistry.get_annotator("CameraParams")]

    def write(self, data):
        print(data["CameraParams"]["cameraViewTransform"])


def main():

    camera = rep.create.camera()

    render_product = rep.create.render_product(camera, (1024, 1024))

    rep.WriterRegistry.register(PrintWriter)
    writer = rep.WriterRegistry.get("PrintWriter")
    writer.initialize()

    # Attach the render product to the Writer
    writer.attach([render_product])

    poses = [(0, 0, 0), (1., 0, 0), (0, 1., 0), (0, 0, 1.)]

    with rep.trigger.on_frame():
        with camera:
            rep.modify.pose(
                    position = rep.distribution.sequence(poses)
            )

    for i in range(len(poses)):
        rep.orchestrator.step()

The printed outputs are:

[ 2.22044605e-16 -2.22044605e-16  1.00000000e+00  0.00000000e+00
  1.00000000e+00  4.93038066e-32 -2.22044605e-16  0.00000000e+00
  0.00000000e+00  1.00000000e+00  2.22044605e-16 -0.00000000e+00
 -0.00000000e+00  0.00000000e+00  0.00000000e+00  1.00000000e+00]

[ 2.22044605e-16 -2.22044605e-16  1.00000000e+00  0.00000000e+00
  1.00000000e+00  4.93038066e-32 -2.22044605e-16  0.00000000e+00
  0.00000000e+00  1.00000000e+00  2.22044605e-16 -0.00000000e+00
 -0.00000000e+00  0.00000000e+00  0.00000000e+00  1.00000000e+00]

[ 2.22044605e-16 -2.22044605e-16  1.00000000e+00  0.00000000e+00
  1.00000000e+00  4.93038066e-32 -2.22044605e-16  0.00000000e+00
  0.00000000e+00  1.00000000e+00  2.22044605e-16 -0.00000000e+00
 -2.22044605e-16  2.22044605e-16 -1.00000000e+00  1.00000000e+00]

[ 2.22044605e-16 -2.22044605e-16  1.00000000e+00  0.00000000e+00
  1.00000000e+00  4.93038066e-32 -2.22044605e-16  0.00000000e+00
  0.00000000e+00  1.00000000e+00  2.22044605e-16 -0.00000000e+00
 -1.00000000e+00 -4.93038066e-32  2.22044605e-16  1.00000000e+00]

The first two cameraViewTransform outputs are still the same, and the matrices seem a bit weird to me. I was expecting something more of the form:

[ 1.00000000e+00  0.00000000e+00  0.00000000e+00  1.00000000e+00
  0.00000000e+00  1.00000000e+00  0.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  1.00000000e+00  0.00000000e+00
  0.00000000e+00  0.00000000e+00  0.00000000e+00  1.00000000e+00]

for example

Would the inverse provide the world transform?

cam_world_to_local = cameraViewTransform.reshape(4, 4)
cam_local_to_world = np.linalg.inv(cam_world_to_local)

Even the inverse didn't seem to provide the world transform:

[[ 2.22044605e-16  1.00000000e+00  0.00000000e+00  0.00000000e+00]
 [-2.22044605e-16  4.93038066e-32  1.00000000e+00  0.00000000e+00]
 [ 1.00000000e+00 -2.22044605e-16  2.22044605e-16  0.00000000e+00]
 [ 0.00000000e+00  0.00000000e+00  0.00000000e+00  1.00000000e+00]]

[[ 2.22044605e-16  1.00000000e+00  0.00000000e+00  0.00000000e+00]
 [-2.22044605e-16  4.93038066e-32  1.00000000e+00  0.00000000e+00]
 [ 1.00000000e+00 -2.22044605e-16  2.22044605e-16  0.00000000e+00]
 [ 0.00000000e+00  0.00000000e+00  0.00000000e+00  1.00000000e+00]]

[[ 2.22044605e-16  1.00000000e+00  0.00000000e+00  0.00000000e+00]
 [-2.22044605e-16  4.93038066e-32  1.00000000e+00  0.00000000e+00]
 [ 1.00000000e+00 -2.22044605e-16  2.22044605e-16  0.00000000e+00]
 [ 1.00000000e+00  0.00000000e+00  0.00000000e+00  1.00000000e+00]]

[[ 2.22044605e-16  1.00000000e+00  0.00000000e+00  0.00000000e+00]
 [-2.22044605e-16  4.93038066e-32  1.00000000e+00  0.00000000e+00]
 [ 1.00000000e+00 -2.22044605e-16  2.22044605e-16  0.00000000e+00]
 [ 0.00000000e+00  1.00000000e+00  0.00000000e+00  1.00000000e+00]]

The poses that are supposed to be used are poses = [(0, 0, 0), (1., 0, 0), (0, 1., 0), (0, 0, 1.)], and I don't see these values in the matrices above.

Until a solution is found, or until this is fixed in the new release, I would suggest accessing the data directly from the annotator:
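
For example, a minimal sketch (assuming the standard AnnotatorRegistry API; rig, render_product, and num_frames stand in for the objects from the earlier snippets):

import omni.replicator.core as rep

rgb_annot = rep.AnnotatorRegistry.get_annotator("rgb")
rgb_annot.attach([render_product])

for i in range(num_frames):
    rep.orchestrator.step()
    rgb = rgb_annot.get_data()        # image data for the frame just rendered
    pos, quat = rig.get_world_pose()  # stage pose read at the same point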

This should give you more control over accessing the data in the stage as well.

Thank you for your help. I will access the data through the annotator for the moment.
It is fine for the camera, but if I randomize the position of another object in the scene (a cube, for example) and want to save its placement at each Replicator frame, at the same time as the captured image, is it still possible?

Yes. With the annotator you would call either world.step() or orchestrator.step() (not both) to feed the annotator with new data. Before or after calling step(), you have full control over the simulation and over reading/modifying the stage.

Let me know if this does not work for your specific scenario.

UPDATE: manual world.render() calls might be required to sync the render data with the simulation stage: Problem with images I get from cameras - #3 by alempereur
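
Putting that together, a rough sketch (world, cube_prim, rgb_annot, and num_frames are illustrative names, not from this thread):

import numpy as np
import omni.replicator.core as rep

for i in range(num_frames):
    # Randomize the cube through the stage API before stepping
    cube_prim.set_world_pose(position=np.random.uniform(-1.0, 1.0, 3))
    world.render()  # extra render call(s) to sync, as per the link above
    rep.orchestrator.step()
    rgb = rgb_annot.get_data()
    cube_pos, cube_quat = cube_prim.get_world_pose()  # matches this frame's image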

Hi @Leopold_M

From my experience, if you move something in Isaac Sim you need to render twice to see the effects.

As for your question, I used the synthetic data helper to save that data.
I just partially updated my repo to the 2022 version.

In practice, I've created an extension that you can load, and it will save all the data from your viewport in the correct format. I also fixed some bugs that affect the synthetic data helper.

In my experience, the matrices are transposed and scaled (thus you take the last row, multiply by the meters_per_unit factor, and transpose).
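
Concretely, that reading would look something like the following sketch (the helper name and the meters_per_unit default are mine, not from the repo):

import numpy as np

def camera_world_pose(view_transform_flat, meters_per_unit=1.0):
    # cameraViewTransform arrives flat and row-major in the USD row-vector
    # convention, so the translation of the inverse sits in the last ROW
    view = np.asarray(view_transform_flat).reshape(4, 4)
    cam_to_world = np.linalg.inv(view)
    position = cam_to_world[3, :3] * meters_per_unit  # last row, scaled
    rotation = cam_to_world[:3, :3].T  # back to column-vector convention
    return position, rotation

Applied to the printed inverses above, this does recover the sequence poses: the third matrix yields (1, 0, 0) and the fourth (0, 1, 0), shifted by the one-frame lag discussed earlier.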

Here is the code for the custom extension: GRADE-RR/extension_custom.py at v2022 · eliabntt/GRADE-RR · GitHub, and here is where I set up the recorder in the code: GRADE-RR/paper_simulation.py at v2022 · eliabntt/GRADE-RR · GitHub. Then with GRADE-RR/paper_simulation.py at v2022 · eliabntt/GRADE-RR · GitHub (my_recorder.counter += 1) you increase the index of the image, and with .update() you record the image and all the data. This works for all the viewports (unless you change the code) and for most of the data you might want (optical flow is still tricky).

The poses of the objects and of the camera are saved using poses and camera here. The camera pose is saved with this snippet (edited to get the correct vfov), and the poses of the objects are obtained here (edited to get the correct pose irrespective of the kind of animations you use; some are not reported by the default method).

If you need, I can support you there.