Changing the GPU Clock Speed does not affect inference time. Any idea why this is happening?

Hello,

I am running YOLOv3-tiny on TensorFlow 2 on the Jetson Nano 2GB.

I have been experimenting with the clock speeds and noticed that lowering the CPU clock speed slows down inference. However, when I lower the GPU clock speed and keep the CPU clock speed at its maximum, the inference time is not affected at all.

The obvious conclusion would be that TensorFlow 2 is running on the CPU, but I ran the usual checks to see whether it can use the GPU, and TensorFlow 2 reports that the GPU is available:

>>> import tensorflow as tf
>>> print(tf.config.list_physical_devices('GPU'))
2023-09-26 13:27:28.163009: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1019] ARM64 does not support NUMA - returning NUMA node zero
2023-09-26 13:27:28.744743: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1019] ARM64 does not support NUMA - returning NUMA node zero
2023-09-26 13:27:28.745110: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1019] ARM64 does not support NUMA - returning NUMA node zero
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

>>> print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
Num GPUs Available:  1

>>> print(tf.test.gpu_device_name())
2023-09-26 13:31:18.044226: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1019] ARM64 does not support NUMA - returning NUMA node zero
2023-09-26 13:31:18.044621: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1019] ARM64 does not support NUMA - returning NUMA node zero
2023-09-26 13:31:18.044898: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1019] ARM64 does not support NUMA - returning NUMA node zero
2023-09-26 13:31:18.045373: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1019] ARM64 does not support NUMA - returning NUMA node zero
2023-09-26 13:31:18.045693: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1019] ARM64 does not support NUMA - returning NUMA node zero
2023-09-26 13:31:18.045850: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /device:GPU:0 with 59 MB memory:  -> device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0, compute capability: 5.3
/device:GPU:0

Even when I run inference in a loop (roughly sketched at the end of this post), I can see the GPU usage increase in jtop, yet I don't understand why decreasing the GPU clock speed has no effect on the inference time. I used the following tutorial:

Does anyone have any idea why this is happening?
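
For reference, my timing loop is roughly like the sketch below (the weights path, input resolution, warm-up count, and iteration count are placeholders, not my exact setup):

import time
import numpy as np
import tensorflow as tf

# Placeholder: in my real script the model comes from the tutorial code;
# here it is simply loaded from a SavedModel directory.
model = tf.keras.models.load_model("yolov3_tiny_saved_model")

# Dummy input at the network's expected resolution (416x416 assumed here).
image = np.random.rand(1, 416, 416, 3).astype(np.float32)

# Warm-up so graph tracing and cuDNN autotuning are excluded from the timing.
for _ in range(5):
    model.predict(image)

# Timed loop.
n_runs = 100
start = time.time()
for _ in range(n_runs):
    model.predict(image)
elapsed = time.time() - start
print("Average inference time: %.1f ms" % (elapsed / n_runs * 1000))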

Hi,

It’s possible that your model is memory-bound, so the CPU is occupied with data transfer. In that case the inference time is limited by how quickly the CPU can feed data to the GPU rather than by GPU compute, which would explain why lowering the GPU clock speed has no visible effect.
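
One quick way to test this (a rough sketch, not a precise benchmark) is to time a pure GPU compute workload at different GPU clock settings and compare it with your model. If the matmul loop below slows down when you lower the GPU clock while your YOLOv3-tiny inference time stays the same, then the inference time is dominated by CPU-side preprocessing and data transfer rather than GPU compute:

import time
import tensorflow as tf

# Matrices sized to fit comfortably in the Nano's small GPU memory pool.
with tf.device("/GPU:0"):
    a = tf.random.normal([1024, 1024])
    b = tf.random.normal([1024, 1024])

    # Warm-up so one-time initialization is not measured.
    _ = tf.linalg.matmul(a, b).numpy()

    start = time.time()
    for _ in range(50):
        c = tf.linalg.matmul(a, b)
    _ = c.numpy()  # forces the queued GPU work to finish before stopping the timer
    elapsed = time.time() - start

print("50 matmuls took %.3f s" % elapsed)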

Have you checked the GPU utilization with tegrastats?
How much of the GPU is actually occupied during inference?

$ sudo tegrastats
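
In the tegrastats output, the GR3D_FREQ field shows the GPU (GR3D) load percentage together with its current frequency. If it stays near 0% while your inference loop is running, the GPU is mostly idle and the time is being spent on the CPU side, which would also explain why the GPU clock speed has no effect.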

Thanks.
