Improving the performance of parallel depth image capture in Omni Isaac Gym

Hello guys,

I am working with Omni Isaac Gym (OIG) and trying to get depth image observations from all environments in parallel. So far my method is extremely slow. I would really appreciate any advice on how to improve the performance, either by using the new Replicator API or by changing the device in sd_helper to CUDA.

This is what I do now:

import torch
from omni.kit.viewport.utility import get_active_viewport_window

self.viewport = get_active_viewport_window("Viewport").viewport_api
self.viewport.resolution = (self._image_size, self._image_size)

[...]

for i in range(self._num_envs):
    # Point the single viewport at this environment's camera before capturing.
    self.viewport.set_active_camera(f"/World/envs/env_{i}/carter/chassis_link/cam/Camera")
    gt = self.sd_helper.get_groundtruth(
        ["rgb", "depthLinear"],
        self.viewport,
    )

    # Copy this environment's depth frame into the preallocated buffer.
    image_tensor = torch.tensor(gt["depthLinear"])
    self.stacked_images_tensor[i] = image_tensor
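
To separate the tensor-copy cost from the rendering cost, here is a minimal sketch of the buffer-filling step on its own. It assumes a hypothetical `capture_depth(env_index)` stand-in (not an Isaac Sim API) that returns one depth frame as a NumPy array, and the sizes are made up for illustration. The frames go into one preallocated host array, so the whole batch could then be moved to the GPU with a single `torch.from_numpy(batch).to(device)` call instead of one `torch.tensor(...)` call per environment. It does not remove the per-camera viewport switch, which I suspect is the real bottleneck.

```python
import numpy as np

NUM_ENVS = 4     # hypothetical number of environments
IMAGE_SIZE = 8   # hypothetical square image resolution


def capture_depth(env_index: int) -> np.ndarray:
    """Hypothetical stand-in for the per-environment depth capture.

    Returns an (IMAGE_SIZE, IMAGE_SIZE) float32 frame filled with the
    environment index, so the stacked result is easy to check.
    """
    return np.full((IMAGE_SIZE, IMAGE_SIZE), float(env_index), dtype=np.float32)


def collect_depth_batch(num_envs: int) -> np.ndarray:
    """Fill one preallocated host buffer with a depth frame per environment."""
    batch = np.empty((num_envs, IMAGE_SIZE, IMAGE_SIZE), dtype=np.float32)
    for i in range(num_envs):
        # Writing into a slice of the buffer avoids allocating a new array per frame.
        batch[i] = capture_depth(i)
    return batch


batch = collect_depth_batch(NUM_ENVS)
```

The resulting `batch` has shape `(NUM_ENVS, IMAGE_SIZE, IMAGE_SIZE)`, which matches how I index `self.stacked_images_tensor` above.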