TensorRT 3 grouped deconvolution slower than non-grouped

moodie · March 15, 2018, 4:01pm

With TensorRT 3.0.4, cuDNN 7, and CUDA 9, I’ve found that a model using grouped deconvolutions is about twice as slow as the same model with non-grouped deconvolutions. I originally trained my model with grouped deconvolutions in MXNet, then padded the weights with zeros to reproduce the same operation with non-grouped deconvolutions, and found that the non-grouped model ran twice as fast. Is this expected? I assumed grouping would require significantly fewer operations, since I’m only using these layers to bilinearly upsample my feature maps.
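For anyone wanting to try the same zero-padding workaround, here is a minimal NumPy sketch of how it could look. It is not the exact script used for the model above. It assumes the MXNet Deconvolution weight layout (in_channels, out_channels / num_group, kH, kW); the function names `expand_grouped_deconv_weight` and `deconv2d` are hypothetical helpers written for this illustration, and the naive `deconv2d` (no padding, no bias) exists only to check that the two weight tensors give the same output.

```python
import numpy as np


def expand_grouped_deconv_weight(w_grouped, num_group):
    """Zero-pad a grouped deconvolution weight into a dense (num_group=1) one.

    Assumed layout: (C_in, C_out / num_group, kH, kW).
    Returns a tensor of shape (C_in, C_out, kH, kW).
    """
    c_in, c_out_per_group, kh, kw = w_grouped.shape
    assert c_in % num_group == 0, "C_in must be divisible by num_group"
    c_in_per_group = c_in // num_group
    c_out = c_out_per_group * num_group

    w_dense = np.zeros((c_in, c_out, kh, kw), dtype=w_grouped.dtype)
    for g in range(num_group):
        i0 = g * c_in_per_group   # first input channel of group g
        o0 = g * c_out_per_group  # first output channel of group g
        # Only the block that links group g's inputs to group g's outputs
        # is non-zero; every cross-group entry stays zero.
        w_dense[i0:i0 + c_in_per_group, o0:o0 + c_out_per_group] = \
            w_grouped[i0:i0 + c_in_per_group]
    return w_dense


def deconv2d(x, w, stride, num_group=1):
    """Naive reference transposed convolution (no padding, no bias).

    x: (C_in, H, W), w: (C_in, C_out / num_group, kH, kW).
    """
    c_in, h, wd = x.shape
    _, c_out_per_group, kh, kw = w.shape
    c_in_per_group = c_in // num_group
    out = np.zeros((c_out_per_group * num_group,
                    (h - 1) * stride + kh,
                    (wd - 1) * stride + kw))
    for i in range(c_in):
        g = i // c_in_per_group  # group that input channel i belongs to
        for o in range(c_out_per_group):
            for y in range(h):
                for xx in range(wd):
                    ys, xs = y * stride, xx * stride
                    # Scatter one scaled copy of the kernel into the output.
                    out[g * c_out_per_group + o,
                        ys:ys + kh, xs:xs + kw] += x[i, y, xx] * w[i, o]
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((8, 5, 5))
    w_grouped = rng.standard_normal((8, 1, 4, 4))  # one group per channel
    w_dense = expand_grouped_deconv_weight(w_grouped, num_group=8)

    y_grouped = deconv2d(x, w_grouped, stride=2, num_group=8)
    y_dense = deconv2d(x, w_dense, stride=2, num_group=1)
    print("outputs match:", np.allclose(y_grouped, y_dense))
    print("weight size ratio:", w_dense.size // w_grouped.size)
```

The dense weight holds num_group times as many values as the grouped one, nearly all of them zero, so the non-grouped layer nominally does far more multiply-adds. That is why it being faster in practice is surprising.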

moodie · March 22, 2018, 2:56pm

I found another forum post indicating that grouped deconvolution in TensorRT is implemented as a separate kernel invocation for each feature channel. That amounts to several hundred kernel invocations per layer instead of one. Is there a timeline for a fix from NVIDIA?

andre.dubbel · April 5, 2018, 8:34am

I have the same issue as moodie. Would also appreciate a fix for this.

SiddharthSharma_TPM · April 26, 2018, 11:28pm

We created a new “Deep Learning Training and Inference” section in Devtalk to improve the experience for deep learning, accelerated computing, and HPC users:
https://devtalk.nvidia.com/default/board/301/deep-learning-training-and-inference-/

We are moving active deep learning threads to the new section.

URLs for topics will not change with the re-categorization, so your bookmarks and links will continue to work as before.

-Siddharth

mvillmow · May 2, 2018, 5:11am

Please file a bug here: https://developer.nvidia.com/nvidia-developer-program
Please include the steps/files used to reproduce the problem along with the output of infer_device.
