Loading image to GPU with pytorch very slow

I have a RealSense camera that I am reading depth frames from in order to run inference on them with a TensorRT model. Before I run a frame through the model, I need to convert it from a NumPy array to a tensor and load it onto the GPU, which I do like this:

img0 = np.asanyarray(c.colorize(DEPTH).get_data())
img = torch.tensor(img0, device=torch.device('cuda'))

but this operation is very slow for some reason, taking between 0.01 and 0.5 seconds.
What I have tried:

  1. Loading a tensor before starting to read images, with torch.zeros(1, device=torch.device('cuda')), to warm things up. Sometimes it helped, but not by much; running it ~50 times gave the same result.
  2. Running sudo jetson_clocks, which did not help either.
  3. Checking jtop while the code was running: the used RAM went up to 4 GB immediately, which I'm not sure is normal. Also, on the GPU tab I saw no change; is it supposed to go up?

Does anyone have suggestions for what else I can try?

Thank you in advance.
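(For anyone who lands here with the same symptom: one common mitigation is to allocate a single pinned, page-locked staging tensor once and reuse it for every frame, instead of creating a new CUDA tensor per frame. Below is a minimal sketch; the frame shape/dtype and the `to_gpu` helper are assumptions, and a placeholder array stands in for the RealSense `c.colorize(DEPTH).get_data()` call.)

```python
import numpy as np
import torch

use_cuda = torch.cuda.is_available()

# Placeholder for the colorized depth frame; the 480x640x3 uint8
# shape is an assumption for illustration.
img0 = np.zeros((480, 640, 3), dtype=np.uint8)

# Allocate one staging tensor up front and reuse it for every frame.
# Pinned (page-locked) host memory speeds up host-to-device copies
# and lets non_blocking=True overlap the copy with other work.
staging = torch.empty(img0.shape, dtype=torch.uint8)
if use_cuda:
    staging = staging.pin_memory()

def to_gpu(frame: np.ndarray) -> torch.Tensor:
    staging.copy_(torch.from_numpy(frame))  # host-side copy into the reused buffer
    return staging.to('cuda', non_blocking=True) if use_cuda else staging

img = to_gpu(img0)
```

This avoids a fresh pageable-memory allocation and synchronous copy on every frame, which is where a large per-frame variance (0.01–0.5 s) often comes from.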


Before running jetson_clocks, have you set the device to the maximum power mode first?
It should look like this:

$ sudo nvpmodel -m 0
$ sudo jetson_clocks


Yes, I did both of those things:

[Screenshot 2022-09-01 102242]

It didn’t really make a difference.


Could you monitor the device status with the following command as well?

$ sudo tegrastats

In general, GPU utilization is expected to reach 99% for optimal performance.
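(Side note on measurement: because CUDA operations run asynchronously, timing the copy with a plain wall clock can silently include time spent waiting on earlier GPU work. A sketch of timing the transfer properly with CUDA events is below; the frame shape is an assumption, and the code falls back to a wall-clock timer on machines without a GPU.)

```python
import time
import numpy as np
import torch

# Hypothetical frame matching a colorized depth image.
frame = np.zeros((480, 640, 3), dtype=np.uint8)

if torch.cuda.is_available():
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    t = torch.from_numpy(frame).to('cuda', non_blocking=True)
    end.record()
    torch.cuda.synchronize()  # wait for the copy before reading the timers
    ms = start.elapsed_time(end)
else:
    t0 = time.perf_counter()
    t = torch.from_numpy(frame)
    ms = (time.perf_counter() - t0) * 1e3
print(f"host-to-device copy: {ms:.3f} ms")
```

If the event-based time is small but the end-to-end loop is still slow, the bottleneck is elsewhere (e.g. the colorizer or queued-up inference work), not the transfer itself.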


This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.