Higher Energy Consumption in the First Epoch of Neural Network Training on Jetson Orin

While training a neural network model on a Jetson Orin module, I observed that the energy consumption during the first epoch is noticeably higher than in subsequent epochs. Below is a table from my training logs showing Time_total and Energy_total per epoch:

| Epoch | Time_total | Energy_total |
|-------|------------|--------------|
| 0     | 0.1355     | 1767414.685  |
| 1     | 0.0161     | 124208.9431  |
| 2     | 0.0158     | 119931.8558  |
| 3     | 0.0152     | 115094.0329  |

This pattern emerges even though the model architecture and training methodology stay constant across epochs. I'm using Python and PyTorch on Ubuntu Linux.

Could the community shed light on why this might be happening? Is it a common occurrence, or something specific to neural networks? Any insights would be greatly appreciated.

Hi @mr.jahani, if you were to break it down further, my bet is that you would find PyTorch taking extra time on the first training steps of the first epoch as it initializes: loading kernels, allocating memory for tensors, etc. This is normal with PyTorch, which is why PyTorch benchmarks typically include a "warmup" period where the timings of the first runs are discarded.
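As a rough illustration of that warmup convention, here is a minimal, generic sketch (plain Python, not tied to any specific PyTorch benchmark API) of timing a callable while discarding the first runs; the function name `benchmark` and its parameters are my own for this example:

```python
import time

def benchmark(fn, warmup=3, iters=10):
    """Time fn() over several runs, discarding the first `warmup` timings.

    The first calls typically pay one-time costs (kernel loading, memory
    allocation, caching), so they are excluded from the reported average.
    """
    timings = []
    for _ in range(warmup + iters):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    steady = timings[warmup:]  # drop the warmup runs
    return sum(steady) / len(steady)
```

You could apply the same idea to your energy logs: treat epoch 0 as warmup and report statistics over epochs 1 onward. Note that for GPU work you would also need to synchronize (e.g. `torch.cuda.synchronize()`) before reading the clock, since CUDA kernels launch asynchronously.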

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.