cuda::DeconvolutionLayer runs slowly

Hi. I’m using trtexec with TensorRT 5 on Xavier to convert a Caffe model into a TRT engine.
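
(For context, the conversion step amounts to roughly the following, sketched here with the TensorRT 5 Python API rather than trtexec itself; the file names, the output blob name and the workspace size are placeholders, not my actual model.)

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Parse the Caffe deploy/model files and build a serialized engine,
# roughly what trtexec does when given a Caffe deploy/model pair.
with trt.Builder(TRT_LOGGER) as builder, \
     builder.create_network() as network, \
     trt.CaffeParser() as parser:
    builder.max_batch_size = 1
    builder.max_workspace_size = 1 << 30                      # placeholder workspace size
    model_tensors = parser.parse(deploy="deploy.prototxt",    # placeholder paths
                                 model="net.caffemodel",
                                 network=network,
                                 dtype=trt.float32)
    network.mark_output(model_tensors.find("output_blob"))    # placeholder blob name
    with builder.build_cuda_engine(network) as engine, open("net.engine", "wb") as f:
        f.write(engine.serialize())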

When I profile the trtexec run (via Nsight), I see that the DeconvolutionLayer takes up a large share of the processing time, even though it carries essentially no weights - it only upscales (by a factor of 2, 4, etc.). Out of hundreds of convolutional layers, the 2-4 deconvolution layers alone account for 30-50% of the processing time.

Here is the relevant profiling line:
https://i.imgur.com/DxhGc3J.png

My deconvolution (upsample) layer in caffe looks like this:

layer {
  name: "123"
  type: "Deconvolution"
  bottom: "122"
  top: "123"
  convolution_param {
    num_output: 128
    bias_term: false
    kernel_size: 2
    group: 128
    stride: 2
  }
}
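
For reference, here is a small NumPy sketch (a hypothetical stand-alone function, not taken from TensorRT or Caffe) of what this layer computes: because group equals num_output and kernel_size equals stride, every input pixel just expands into an independent 2x2 output block scaled by that channel's 2x2 kernel, i.e. a learned 2x upsampling.

import numpy as np

def depthwise_deconv_k2_s2(x, w):
    # Grouped ("depthwise") transposed convolution with kernel_size=2, stride=2,
    # matching the layer above: x has shape (C, H, W), w has shape (C, 2, 2).
    C, H, W = x.shape
    y = np.zeros((C, 2 * H, 2 * W), dtype=x.dtype)
    for c in range(C):
        for i in range(H):
            for j in range(W):
                # Each input pixel fills one non-overlapping 2x2 output block.
                y[c, 2 * i:2 * i + 2, 2 * j:2 * j + 2] = x[c, i, j] * w[c]
    return y

# With all-ones kernels this is exactly nearest-neighbour 2x upsampling.
x = np.random.rand(128, 16, 16).astype(np.float32)
w = np.ones((128, 2, 2), dtype=np.float32)
assert np.allclose(depthwise_deconv_k2_s2(x, w),
                   x.repeat(2, axis=1).repeat(2, axis=2))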

I don’t understand why cudnnConvolutionBackwardData is called here at all (see the image) - shouldn’t this be a forward-only pass?
How can I solve this issue?

Thanks.

Hi, do you have any solutions?

I’m experiencing the same problem here:
https://devtalk.nvidia.com/default/topic/1052490/tensorrt/tensorrt4-convtranspose-layer-very-slow-inference-speed-/