How to get frame buffer pointer of rendered frame?

Isaac Sim Version

4.5.0
4.2.0
4.1.0
4.0.0
2023.1.1
2023.1.0-hotfix.1
Other (please specify):

Operating System

Ubuntu 22.04
Ubuntu 20.04
Windows 11
Windows 10
Other (please specify):

GPU Information

  • Model: NVIDIA GeForce RTX 4090
  • Driver Version: 550.120

Topic Description

How to get the direct frame buffer address for the frame that is being rendered?

Detailed Description

I'm trying to build a hardware-in-the-loop simulation system for a robot, which streams rendered frames from Isaac Sim cameras via HDMI/DisplayPort to a custom FPGA board, which then sends the frames to a robot controller.

From the documentation, tutorials, and forum posts, I learned:

  1. There are some APIs to capture rendered frames into a buffer, such as capture_frames_to_buffer(), rep_annotators.get_data(device="cuda"), or capture_next_frame_texture() (see the sketch after this list).
  2. Also, on some hard-to-find pages (which cost me a lot of time to locate), there are low-level APIs and OmniGraph nodes, including Sd On Frame (Sd Render Var Display Texture — Omniverse Kit) and Cam Cfa2x2 Encode Task Node, whose parameters may contain low-level pointers.
    However, I didn't find any tutorials or example scripts on how to use them.
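
A minimal sketch of the annotator route from item 1, assuming omni.replicator.core is available and a camera exists at /World/Camera (the path and resolution are placeholders, and the exact get_data(device="cuda") signature may vary by version):

import omni.replicator.core as rep

# A render product renders the camera off-screen; no viewport window is needed.
render_product = rep.create.render_product("/World/Camera", (1920, 1080))

# Attach an RGB annotator to the render product.
rgb_annotator = rep.AnnotatorRegistry.get_annotator("rgb")
rgb_annotator.attach(render_product)

# After at least one frame has rendered, request the data on the GPU;
# with device="cuda" the array is expected to stay in device memory.
rgb = rgb_annotator.get_data(device="cuda")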

So I would like to know whether there are any APIs:

  1. to directly obtain the pointer/memory address of the back buffer into which the frame is being rendered, and/or
  2. to control the swapchain so that the rendered frame is swapped directly to the front buffer and can be sent out via HDMI?
    Alternatively, are there any examples of using the APIs I mentioned above?

I searched this forum and the Isaac Sim/Omniverse/extension/RTX renderer documentation for a long time and found no solution.

Any help would be appreciated.

Hello,

I am not sure whether this will be useful for you.

In my case, I use the capture_viewport_to_buffer() API. The relevant code is below:

import ctypes

import carb
import numpy as np
from PIL import Image

from omni.kit.viewport.utility import (
    capture_viewport_to_buffer,
    get_viewport_from_window_name,
)

captures_face = []

def buffer2numpy(buffer, buffer_size, width, height):
    """Convert the PyCapsule passed to the capture callback into a numpy array."""
    try:
        ctypes.pythonapi.PyCapsule_GetPointer.restype = ctypes.POINTER(
            ctypes.c_byte * buffer_size)
        ctypes.pythonapi.PyCapsule_GetPointer.argtypes = [ctypes.py_object,
                                                          ctypes.c_char_p]
        content = ctypes.pythonapi.PyCapsule_GetPointer(buffer, None)
    except Exception as e:
        carb.log_error(f"Failed to capture viewport: {e}")
        return None
    img = Image.frombuffer('RGBA', (width, height), content.contents)
    img_np = np.array(img)
    return img_np[:, :, :3]  # drop the alpha channel

def viewport_face_capture(buffer, buffer_size, width, height, format):
    capture_face = buffer2numpy(buffer, buffer_size, width, height)
    # change this to adapt it to your needs
    captures_face.append(capture_face)

vp_api = get_viewport_from_window_name("Viewport face")
capture_viewport_to_buffer(vp_api, viewport_face_capture)


Thank you for your suggestion!

I would like to know whether capture_viewport_to_buffer() captures the image into a GPU buffer or copies it through the CPU into host memory.

Actually, I hope to take the rendered image directly from a GPU buffer and copy it to another GPU buffer for transmission via HDMI. From what I have read, copies between GPU and CPU may introduce latency.
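
For what it's worth, if the annotator route returns a GPU-resident array (e.g. one exposing __cuda_array_interface__, as Warp arrays do), a device-to-device copy could look like the following. This is a sketch under that assumption, not a confirmed Isaac Sim API; rgb_annotator is the hypothetical annotator from the earlier sketch:

import cupy as cp

# Assumed: get_data(device="cuda") returns an object implementing
# __cuda_array_interface__, so CuPy can wrap it without a host copy.
gpu_frame = rgb_annotator.get_data(device="cuda")
frame_cp = cp.asarray(gpu_frame)  # zero-copy view on the device

# Raw device pointer and byte size, e.g. for a CUDA-aware output path.
dev_ptr = frame_cp.data.ptr
nbytes = frame_cp.nbytes

# Device-to-device copy into another GPU buffer; nothing touches the CPU.
dst = cp.empty_like(frame_cp)
cp.cuda.runtime.memcpy(dst.data.ptr, dev_ptr, nbytes,
                       cp.cuda.runtime.memcpyDeviceToDevice)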

I am also interested in this. I am currently grabbing a camera buffer by creating a viewport, setting the camera on that viewport, and using:

capture_viewport_to_buffer(viewport, self.on_viewport_captured)

# Callback method on my extension class; ctypes, carb, numpy (np), and
# PIL.Image are imported at module level.
def on_viewport_captured(self, buffer, buffer_size, width, height, format):
    """Handles viewport capture and updates the UI with the image data."""
    size = (width, height)
    try:
        ctypes.pythonapi.PyCapsule_GetPointer.restype = ctypes.POINTER(
            ctypes.c_byte * buffer_size)
        content = ctypes.pythonapi.PyCapsule_GetPointer(buffer, None)
        img = Image.frombytes("RGBA", size, content.contents)
        np_data = np.asarray(img).data

        self._dynamic_texture.set_data_array(np_data, img.size)
        print("capture complete")
    except Exception as e:
        carb.log_error(f"Capture failed: {e}")

but I would really like to grab the camera view directly, without creating a viewport. Please reply here if you find a way to do this.
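
One possibly viewport-free route is a Replicator render product, which renders straight from a camera prim, as in the sketch earlier in this thread. A hedged sketch of feeding the dynamic texture from it, inside the same extension class as the callback above (camera path and resolution are examples):

import omni.replicator.core as rep

render_product = rep.create.render_product("/World/Camera", (1280, 720))
rgb_annotator = rep.AnnotatorRegistry.get_annotator("rgb")
rgb_annotator.attach(render_product)

# Called per frame, e.g. from an update subscription: get_data() returns an
# RGBA array on the host by default, which the dynamic texture can consume.
rgba = rgb_annotator.get_data()
self._dynamic_texture.set_data_array(rgba, (rgba.shape[1], rgba.shape[0]))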

Could we perhaps use the synthetic data pipeline?

I haven’t found a solution yet.

The synthetic data pipeline is what I tried before, but unfortunately I couldn't build a complete pipeline (I didn't find a corresponding node to provide the inputs for this node).

If you succeed, could you please share the OmniGraph?

Thanks very much!

Hi Nicolas, have you resolved this yet? What's the expected loop rate for your system? It's possible that a GPU-to-CPU frame copy will be fast enough to meet your timing requirements. Depending on your camera, it might even be faster than the capture rate of your real hardware.

Regardless of the underlying memory handling, you may want to run some timing checks to confirm that the performance is sufficient, then optimize for GPU-to-GPU transfer only if it is not. A minimal sketch of such a check follows.
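
A timing sketch, assuming omni.kit.viewport.utility as used earlier in the thread; it measures wall-clock time from the capture request to the callback delivering the buffer:

import time

from omni.kit.viewport.utility import (
    capture_viewport_to_buffer,
    get_active_viewport,
)

_t_start = 0.0

def _on_captured(buffer, buffer_size, width, height, fmt):
    # Elapsed time includes rendering, the GPU-to-CPU copy, and callback dispatch.
    latency_ms = (time.perf_counter() - _t_start) * 1000.0
    print(f"capture latency: {latency_ms:.2f} ms for {width}x{height}")

viewport = get_active_viewport()
_t_start = time.perf_counter()
capture_viewport_to_buffer(viewport, _on_captured)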