Higher memory usage on Ampere compared to Pascal

Hi, I see higher memory usage on Ampere than on Pascal when running inference with onnxruntime on a ResNet50 classifier ONNX model (models/resnet50-v1-12.onnx at master · onnx/models · GitHub).


  • Ubuntu: 20.04
  • CUDA: 11.1
  • cuDNN: 8.0.5
  • ONNX Runtime: 1.8.1


Laptop:
  • CPU: Intel(R) Core™ i7-8750H CPU @ 2.20GHz (6C/12T)
  • RAM: 32 GB
  • GPU: NVIDIA GeForce GTX 1060 with Max-Q Design (6GB)


Server:
  • CPU: AMD EPYC 7V12 64-Core
  • RAM: 886 GB
  • GPU: A100-SXM4 (40GB)


Platform   RAM (MB)   GPU Memory (MB)   Latency (ms)
Laptop     801        523               8.2
Server     1838       1441              3.3

The results show that the server uses about 2.7x more GPU memory and about 2.3x more RAM than the laptop for the same model. Could you please advise why this is happening? Thanks!
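For anyone who wants to reproduce the comparison, here is a minimal sketch of one way to sample per-process GPU memory on both machines. It assumes `nvidia-smi` is on the PATH and uses its `--query-compute-apps` CSV output. The helper names `parse_compute_apps` and `gpu_memory_by_pid` are made up for this example, and this is not necessarily how the numbers above were collected.

```python
import subprocess

# nvidia-smi query that lists each compute process and the GPU memory it holds (MiB).
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(csv_text):
    """Turn 'pid, used_memory' CSV lines into a {pid: MiB} dict.

    Lines whose memory field is not a plain number (for example '[N/A]')
    are skipped instead of raising.
    """
    usage = {}
    for line in csv_text.strip().splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 2 or not fields[0].isdigit() or not fields[1].isdigit():
            continue
        usage[int(fields[0])] = int(fields[1])
    return usage


def gpu_memory_by_pid():
    """Run nvidia-smi once and return the parsed per-process usage."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_compute_apps(result.stdout)
```

Calling `gpu_memory_by_pid()` after a few warm-up inferences, and looking up the PID of the inference process, gives a figure that can be compared across machines. Note that the per-process value reported this way includes the CUDA context itself, whose size can differ between GPU architectures.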

The title of the post seems misleading. More accurate would be: “Different memory usage observed on two totally different hardware platforms”.

You would want to check with the ResNet50 developers as to why that is. There may be a forum or a mailing list for answering questions ResNet users have about the product. My hypothesis is that ResNet uses more resources on systems that have more resources available in order to speed up processing. But I have never used ResNet.
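One way to test the hypothesis that the runtime scales its footprint with the hardware is to pin the memory-related settings of the ONNX Runtime CUDA execution provider and rerun on both machines. The sketch below is only an illustration: the option names (`gpu_mem_limit`, `arena_extend_strategy`, `cudnn_conv_algo_search`) are taken from the CUDA execution provider documentation, the chosen values are assumptions, and the helpers `cuda_providers` and `make_session` are hypothetical names. Whether these settings explain the difference reported above is untested.

```python
def cuda_providers(gpu_mem_limit_bytes=2 * 1024**3, device_id=0):
    """Build a providers list for onnxruntime.InferenceSession.

    The values are illustrative assumptions, not recommended defaults:
    - gpu_mem_limit caps the size of the CUDA memory arena (in bytes)
    - kSameAsRequested grows the arena only by what is requested
    - HEURISTIC avoids the exhaustive cuDNN convolution algorithm search
    """
    cuda_options = {
        "device_id": device_id,
        "gpu_mem_limit": gpu_mem_limit_bytes,
        "arena_extend_strategy": "kSameAsRequested",
        "cudnn_conv_algo_search": "HEURISTIC",
    }
    # CPU provider is listed last as a fallback for unsupported nodes.
    return [("CUDAExecutionProvider", cuda_options), "CPUExecutionProvider"]


def make_session(model_path, providers):
    """Create the inference session; onnxruntime is imported lazily."""
    import onnxruntime as ort

    return ort.InferenceSession(model_path, providers=providers)
```

For example, `make_session("resnet50-v1-12.onnx", cuda_providers(1024**3))` would create a session with the arena capped at 1 GiB. If memory usage on the two GPUs converges with identical settings, the gap is more likely due to allocator and algorithm-selection behaviour than to the model itself.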