Higher Resource Consumption on Ampere architecture vs Turing architecture

aabuzrara · February 20, 2022, 11:36am

Hi,
I have a machine with an NVIDIA RTX 8000 that I use to run several AI models (ResNet50 with PyTorch, an Azure Custom Vision model exported as ONNX, and others). I recently got a new machine with an RTX A4000 GPU, and I have noticed a big difference in resource consumption, in both CPU RAM and GPU RAM.

As an example, one ONNX model on the RTX 8000 machine consumes 1.8 GB of CPU RAM and 711 MB of GPU RAM just to load a single image and run inference on it continuously. If I run the exact same code with the same model on the RTX A4000 machine, it consumes 3.2 GB of CPU RAM and 1.49 GB of GPU RAM, which is a huge difference.
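To make the comparison easier to reproduce, here is a minimal sketch of how these figures could be collected and compared. It is not the code used for the measurements above. The helper names (`parse_memory_used`, `gpu_memory_used_mib`, `growth_factor`) are made up for illustration, and the sketch assumes `nvidia-smi` is on the PATH and that 1 GB is counted as 1024 MB. Note that `memory.used` reports usage for the whole GPU, not for a single process, so it is only meaningful if nothing else is running on the card.

```python
import subprocess


def parse_memory_used(csv_text: str) -> list[int]:
    """Parse the output of
    `nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits`,
    which prints one integer (MiB) per GPU, one per line."""
    return [int(line.strip()) for line in csv_text.splitlines() if line.strip()]


def gpu_memory_used_mib() -> list[int]:
    """Return the memory currently in use on each visible GPU, in MiB.
    Requires an NVIDIA driver with nvidia-smi installed."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_memory_used(result.stdout)


def growth_factor(old: float, new: float) -> float:
    """How many times larger `new` is than `old`."""
    return new / old


# Figures reported in this post (RTX 8000 machine vs. RTX A4000 machine).
cpu_factor = growth_factor(1.8, 3.2)           # CPU RAM, both in GB
gpu_factor = growth_factor(711, 1.49 * 1024)   # GPU RAM in MB, assuming 1 GB = 1024 MB

print(f"CPU RAM grew by {cpu_factor:.2f}x, GPU RAM by {gpu_factor:.2f}x")
```

Calling `gpu_memory_used_mib()` once before loading the model and once after the first inference, on both machines, would give numbers that can be compared directly.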

I understand that these two cards use different architectures (Turing vs. Ampere), but I am surprised to see such a huge difference. At first I thought it could be the CUDA version, but I have tried different combinations of CUDA and cuDNN and the result is the same.

Do you have any idea what could be the cause, and whether there is a way to solve it?

Thanks
