I have a RealSense camera that I am reading depth frames from in order to run inference on them with a TensorRT model. Before I run a frame through the model I need to convert it from a NumPy array to a tensor and load it onto the GPU. I do it like this:
img0 = np.asanyarray(c.colorize(DEPTH).get_data())
img = torch.tensor(img0, device=torch.device('cuda'))
but this operation is very slow for some reason, taking between 0.01 and 0.5 seconds.
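For context, here is a minimal sketch of the capture loop. The RealSense pipeline setup here is simplified (my real code configures the streams explicitly and runs the TensorRT model afterwards), and I added a synchronize call just so the timing is not skewed by pending GPU work:

```python
import time

import numpy as np
import pyrealsense2 as rs
import torch

# Simplified pipeline setup -- the real code configures streams explicitly
pipeline = rs.pipeline()
pipeline.start()
c = rs.colorizer()

while True:
    frames = pipeline.wait_for_frames()
    DEPTH = frames.get_depth_frame()

    t0 = time.perf_counter()
    img0 = np.asanyarray(c.colorize(DEPTH).get_data())
    img = torch.tensor(img0, device=torch.device('cuda'))
    torch.cuda.synchronize()  # wait for outstanding GPU work so the timing is honest
    print(f"numpy -> GPU tensor took {time.perf_counter() - t0:.3f} s")
```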
What I tried to do:
- Loaded a tensor before starting to read images, with `torch.zeros(1, device=torch.device('cuda'))` (the exact warm-up snippet is shown after this list). Sometimes it helped, but not by much. I also tried running it ~50 times, with the same result.
- Ran `sudo jetson_clocks`, which did not help either.
- Checked `jtop` while running the code: the used RAM went up to 4 GB immediately, and I'm not sure if that's normal. On the GPU tab I saw no change at all. Is GPU usage supposed to go up?
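The warm-up from the first point looks roughly like this, run once before the capture loop starts:

```python
import torch

# One-time warm-up before the capture loop: forces CUDA context creation
# so the first real host-to-device copy does not pay that startup cost
_ = torch.zeros(1, device=torch.device('cuda'))
torch.cuda.synchronize()
```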
Does anyone have suggestions for what else I can try?
Thank you in advance.