How to get the frame buffer pointer of a rendered frame?

Isaac Sim Version

4.5.0
4.2.0
4.1.0
4.0.0
2023.1.1
2023.1.0-hotfix.1
Other (please specify):

Operating System

Ubuntu 22.04
Ubuntu 20.04
Windows 11
Windows 10
Other (please specify):

GPU Information

  • Model: NVIDIA GeForce RTX 4090
  • Driver Version: 550.120

Topic Description

How to get the direct frame buffer address for the frame that is being rendered?

Detailed Description

I’m trying to build a hardware-in-the-loop simulation system for a robot, which streams rendered frames from Isaac Sim cameras via HDMI/DisplayPort to a custom FPGA board, which then sends the frames to a robot controller.

From the documentation, tutorials, and forum posts, I learned the following:

  1. There are some APIs to capture rendered frames into a buffer, such as capture_frames_to_buffer(), rep_annotators.get_data(device="cuda"), or capture_next_frame_texture() (see the sketch after this list).
  2. Also, on some very hard-to-find pages (which cost me a lot of time to locate), there are some low-level APIs and OmniGraph nodes, including Sd On Frame (Sd Render Var Display Texture — Omniverse Kit) and Cam Cfa2x2 Encode Task Node, whose parameters may contain some low-level pointers.
    However, I didn't find any tutorials or example scripts on how to use them.
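
From those documentation snippets, I assume the annotator route is used roughly like the sketch below (untested; the "rgb" annotator name, the resolution, and the camera path /World/Camera are my assumptions):

import omni.replicator.core as rep

# Create a render product for an existing camera prim (path is hypothetical).
rp = rep.create.render_product("/World/Camera", (1920, 1080))

# Request the RGB annotator with its output kept on the GPU.
rgb_annot = rep.AnnotatorRegistry.get_annotator("rgb", device="cuda")
rgb_annot.attach(rp)

# After the renderer has produced a frame:
data = rgb_annot.get_data()  # expected: a GPU-resident array (e.g. warp.array)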

So I would like to know whether there are any APIs:

  1. to directly obtain the pointer/memory address of the back buffer into which the frame is being rendered, and/or
  2. to control the swapchain so that the rendered frame is swapped directly to the front buffer, allowing frames to be sent out via HDMI?
    Alternatively, are there any examples of using the APIs I mentioned above?

I searched this forum and the Isaac Sim/Omniverse/extension/RTX renderer documentation for a long time and found no solution.

Any help would be appreciated.

Hello,

I am not sure whether this will be useful for you.

In my case, I use the capture_viewport_to_buffer() API. The relevant code is below:

import ctypes

import carb
import numpy as np
from PIL import Image
from omni.kit.viewport.utility import (capture_viewport_to_buffer,
                                       get_viewport_from_window_name)

captures_face = []

def buffer2numpy(buffer, buffer_size, width, height):
    # Resolve the PyCapsule handed to the capture callback into a raw pointer.
    try:
        ctypes.pythonapi.PyCapsule_GetPointer.restype = ctypes.POINTER(
            ctypes.c_byte * buffer_size)
        ctypes.pythonapi.PyCapsule_GetPointer.argtypes = [ctypes.py_object,
                                                          ctypes.c_char_p]
        content = ctypes.pythonapi.PyCapsule_GetPointer(buffer, None)
    except Exception as e:
        carb.log_error(f"Failed to capture viewport: {e}")
        return None
    img = Image.frombuffer('RGBA', (width, height), content.contents)
    img_np = np.array(img)
    return img_np[:, :, :3]  # drop the alpha channel

def viewport_face_capture(buffer, buffer_size, width, height, format):
    capture_face = buffer2numpy(buffer, buffer_size, width, height)
    # you can change this to adapt to your needs
    captures_face.append(capture_face)

vp_api = get_viewport_from_window_name("Viewport face")
capture_viewport_to_buffer(vp_api, viewport_face_capture)


Thank you for your suggestion!

I would like to know whether capture_viewport_to_buffer() captures the image into a GPU buffer, or copies it through the CPU into host memory?

Actually, I hope to read the rendered image directly from a GPU buffer and copy it to another GPU buffer for transmission via HDMI. From what I have learned, copies between GPU and CPU may introduce latency.
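
If the annotator route from my first post works, my understanding is that any array implementing the standard __cuda_array_interface__ protocol (Warp, CuPy, and PyTorch arrays all do) exposes its raw CUDA device address, so a device-to-device copy should avoid the CPU round trip entirely. A sketch of what I have in mind, assuming get_data(device="cuda") returns such an array:

import math

import cupy as cp
import numpy as np

data = rgb_annot.get_data()  # GPU-resident array from the annotator sketch above

iface = data.__cuda_array_interface__
dev_ptr = iface["data"][0]  # raw CUDA device address of the frame
nbytes = math.prod(iface["shape"]) * np.dtype(iface["typestr"]).itemsize

# Device-to-device copy into a second GPU buffer, e.g. with CuPy:
dst = cp.empty(iface["shape"], dtype=np.dtype(iface["typestr"]))
cp.cuda.runtime.memcpy(dst.data.ptr, dev_ptr, nbytes,
                       cp.cuda.runtime.memcpyDeviceToDevice)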

Also interested in this. I am currently grabbing a camera buffer by creating a viewport, assigning the camera to that viewport, and using:

# Both of these live inside a class; `viewport` and `self._dynamic_texture`
# are created elsewhere.
capture_viewport_to_buffer(viewport, self.on_viewport_captured)

def on_viewport_captured(self, buffer, buffer_size, width, height, format):
    """Handles viewport capture and updates UI with image data."""
    size = (width, height)
    try:
        ctypes.pythonapi.PyCapsule_GetPointer.restype = ctypes.POINTER(ctypes.c_byte * buffer_size)
        ctypes.pythonapi.PyCapsule_GetPointer.argtypes = [ctypes.py_object, ctypes.c_char_p]
        content = ctypes.pythonapi.PyCapsule_GetPointer(buffer, None)
        img = Image.frombytes("RGBA", size, content.contents)
        np_data = np.asarray(img).data

        self._dynamic_texture.set_data_array(np_data, img.size)
        print("capture complete")
    except Exception as e:
        carb.log_error(f"Capture failed: {e}")

but I would really like to grab the camera view directly, without creating a viewport. Please reply here if you find a way to do this.

Could we perhaps use the synthetic data pipeline?
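
Something like the sketch below is what I have in mind (untested; the camera path and resolution are assumed), since a Replicator render product renders a camera prim without any UI viewport:

import omni.replicator.core as rep

rp = rep.create.render_product("/World/Camera", (1280, 720))
rgb_annot = rep.AnnotatorRegistry.get_annotator("rgb")
rgb_annot.attach(rp)

rep.orchestrator.step()       # advance the pipeline so a frame is rendered
frame = rgb_annot.get_data()  # numpy array of shape (H, W, 4), RGBA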

I haven’t found a solution yet.

The synthetic data pipeline is what I tried before, but unfortunately I couldn't build a complete pipeline (I didn't find a corresponding node to provide the inputs for that node).

If you succeed, could you please share the OmniGraph?

Thanks very much!