PyTorch 1.2 / CUDA 10.0 vs. PyTorch 1.9 / CUDA 11.1: significant slowdown

Hello,
I recently upgraded PyTorch from 1.2 (CUDA 10.0) to the latest 1.9 (CUDA 11.1).

However, using the same RTX 2080 Ti I observe a massive slowdown in inference (the same code is used). With the PyTorch autograd profiler I see:

PyTorch 1.2, CUDA 10.0
Name          Self CPU total %  Self CPU total  CPU total %  CPU total  CPU time avg  CUDA total %  CUDA total  CUDA time avg  Number of Calls
convolution   0.74%             386.450us       8.93%        4.690ms    72.161us      9.91%         9.902ms     152.342us      65
_convolution  1.57%             824.964us       8.20%        4.304ms    66.216us      9.60%         9.592ms     147.563us      65
conv2d        0.72%             377.362us       8.82%        4.632ms    75.928us      9.10%         9.086ms     148.953us      61

PyTorch 1.9, CUDA 11.1
Name                Self CPU %  Self CPU   CPU total %  CPU total  CPU time avg  Self CUDA  Self CUDA %  CUDA total  CUDA time avg  # of Calls
aten::convolution   1.33%       850.084us  17.73%       11.327ms   174.265us     270.592us  0.34%        26.337ms    405.189us      65
aten::_convolution  2.03%       1.295ms    16.40%       10.477ms   161.187us     321.695us  0.40%        26.067ms    401.026us      65
aten::conv2d        1.22%       781.662us  17.38%       11.105ms   182.043us     247.232us  0.31%        25.043ms    410.548us

As can be clearly seen, there is an almost 3x slowdown with the new version (CUDA total for the convolution ops goes from about 9.9 ms to about 26.3 ms).
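
For reference, the numbers above come from the autograd profiler. A minimal sketch of the kind of script that produces such a table is below; the model, input shape, and iteration counts are placeholders, not my actual code:

```python
import torch
import torchvision  # placeholder model source; any convolutional model works

# Placeholder network and input; the real model/shapes are not shown here.
model = torchvision.models.resnet18().cuda().eval()
x = torch.randn(1, 3, 224, 224, device="cuda")

# Warm-up so one-time costs (CUDA context creation, cuDNN algorithm
# selection) are not counted in the profile.
with torch.no_grad():
    for _ in range(10):
        model(x)
torch.cuda.synchronize()

# Profile with CUDA timing enabled and print per-op aggregates,
# which yields tables like the ones above.
with torch.no_grad(), torch.autograd.profiler.profile(use_cuda=True) as prof:
    for _ in range(50):
        model(x)
torch.cuda.synchronize()

print(prof.key_averages().table(sort_by="cuda_time_total"))
```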

Looking for suggestions and answers,

Thank you,
Alex

The primary library used by PyTorch for these operations is cuDNN. There is a separate forum for cuDNN questions; you may wish to ask your question there. I can move this topic there if you like.
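
In the meantime, it may be worth confirming which CUDA and cuDNN versions each build actually reports, and whether cuDNN autotuning is enabled. A quick check looks like this (the benchmark flag is a general suggestion for fixed input shapes, not a confirmed fix for your particular case):

```python
import torch

print(torch.__version__)               # e.g. 1.2.0 vs 1.9.0
print(torch.version.cuda)              # CUDA toolkit the wheel was built against
print(torch.backends.cudnn.version())  # cuDNN version as an integer (7xxx vs 8xxx)
print(torch.backends.cudnn.enabled)    # is cuDNN dispatch enabled?

# For fixed input shapes, letting cuDNN benchmark its convolution
# algorithms at runtime often helps convolution performance.
torch.backends.cudnn.benchmark = True
```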

Could you please move it?

Thank you,
Alex.

Hi @alex.spivakovsky,
Could you share the model, a reproducible script, and detailed logs so that we can assist you better?
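
For example, a minimal self-contained timing script along these lines, run under both builds, would let us compare directly (the layers and input sizes below are placeholders, please substitute your actual model):

```python
import time
import torch
import torch.nn as nn

# Placeholder stand-in for the real network.
model = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(64, 64, kernel_size=3, padding=1),
).cuda().eval()

x = torch.randn(8, 3, 512, 512, device="cuda")

with torch.no_grad():
    # Warm-up (lazy CUDA init, cuDNN algorithm selection).
    for _ in range(10):
        model(x)
    torch.cuda.synchronize()

    # Timed loop; synchronize before reading the clock so GPU work is included.
    start = time.perf_counter()
    for _ in range(100):
        model(x)
    torch.cuda.synchronize()
    elapsed = time.perf_counter() - start

print(f"PyTorch {torch.__version__}, CUDA {torch.version.cuda}: "
      f"{elapsed / 100 * 1000:.3f} ms/iter")
```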

Thanks!