Confusion over get_camera_image Output

Hello. I’m trying to feed the output of the get_camera_image function into a CV model. The resulting numpy array is 2D with shape [camera_height, camera_width x channels]. I assume the images are RGBA, since there are 4 channels. I used np.split to try to reformat this array into 3D so that imageai would know what to do with it, and experimented with keeping only 3 of the channels because I want RGB rather than RGBA for the model. However, when I saved the 3D array as an image using Pillow, the image looked shifted and the colors were not what I expected, which leads me to believe I didn’t grab the right part of the array for each channel. Is there any additional documentation on how the output of get_camera_image is structured, so I can get a better idea of how to reformat it for other image-related tools?

Hi @michaela.buchanan

Currently, I am using the following code to reshape the images in skrl’s utils module. I hope this is helpful…

# get image
image = gym.get_camera_image(sim, 
# for camera_type == gymapi.IMAGE_COLOR
image = image.reshape(image.shape[0], -1, 4)[..., :3]
# for camera_type == gymapi.IMAGE_DEPTH
image = image.reshape(image.shape[0], -1)  
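
For anyone who wants to check the reshape without a running simulation, here is a minimal sketch on a synthetic array. It assumes the color output is a uint8 array of shape [height, width * 4] with the RGBA values interleaved per pixel, which is what the reshape above implies. The helper name rgba_flat_to_rgb and the test data are made up for illustration and are not part of Isaac Gym or skrl.

```python
import numpy as np


def rgba_flat_to_rgb(flat, channels=4):
    """Reshape a [H, W * channels] array to [H, W, 3], dropping alpha.

    Hypothetical helper: assumes channels are interleaved per pixel
    (R, G, B, A, R, G, B, A, ...) along each row.
    """
    height = flat.shape[0]
    return flat.reshape(height, -1, channels)[..., :3]


# Synthetic stand-in for the camera output: 2 x 3 pixels, RGBA per pixel.
height, width = 2, 3
pixels = np.arange(height * width * 4, dtype=np.uint8).reshape(height, width, 4)
flat = pixels.reshape(height, width * 4)  # same layout as the 2D camera array

rgb = rgba_flat_to_rgb(flat)

# For comparison: np.split cuts each row into 4 contiguous chunks, so each
# "channel" ends up holding values from different pixels and channels.
split_stack = np.stack(np.split(flat, 4, axis=1), axis=-1)

print(rgb.shape)                             # (height, width, 3)
print(np.array_equal(rgb, pixels[..., :3]))  # reshape recovers R, G, B
print(np.array_equal(split_stack, pixels))   # split does not
```

The np.split comparison is probably why the saved image looked shifted with odd colors: splitting along the width separates blocks of columns, not channels. If Pillow complains about the sliced array, passing np.ascontiguousarray(rgb) to Image.fromarray should help.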

That did the trick for me. Thank you for the quick response!
