Hello,
I have a model built around 3D operations; the main layer is Conv3d. With nvprof, 90% of the inference time is spent in cudnn::detail::implicit_convolveND_sgemm<float …>. After switching to FP16 there is little performance improvement; the inference time is now dominated by cudnn::detail::implicit_convolveND_sgemm<__half …>.
TensorRT version is 7, running on an RTX 2080 Ti. Any suggestions? Thanks
Yes, TRT supports 3D conv layers. Speed depends on many parameters, such as GPU type.
Kernel selection depends on the layer parameters; we have fast kernels for some commonly used configurations, such as a 3×3×3 filter size.
Other configurations fall back to a general default kernel implementation, which might be slow.
I am using the latest CUDA/cuDNN/TRT versions: CUDA 10.2, cuDNN 7.6.5, TRT 7.0.0.11.
I analyzed the network timing with nvprof again. The time is mainly concentrated in a Conv3d layer with a 3×3×3 filter size, 32 groups, stride 1 or 2, and padding 1. The input and output shapes are e.g. 128x8x28x28. With FP16 this layer runs implicit_convolveND_sgemm 32 times, and my model contains many such layers (33).
I tested the time consumption of part of the network: 2.5 ms with FP32, 2.8 ms with FP16.
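For reference, the spatial shapes above are consistent with the standard convolution output-size formula, out = floor((in + 2·pad − kernel) / stride) + 1. A quick sketch (plain Python, shape arithmetic only) showing why a 3×3×3 filter with padding 1 preserves the size at stride 1 and halves it at stride 2:

```python
def conv_out_size(n, k, s, p):
    # Standard convolution output-size formula:
    # out = floor((n + 2*p - k) / s) + 1
    return (n + 2 * p - k) // s + 1

# 3x3x3 filter, padding 1, as in the layer described above:
# stride 1 preserves the spatial size, stride 2 roughly halves it.
print(conv_out_size(28, 3, 1, 1))  # 28 -> 28 at stride 1
print(conv_out_size(28, 3, 2, 1))  # 28 -> 14 at stride 2
print(conv_out_size(8, 3, 1, 1))   # depth 8 -> 8 at stride 1
```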
Hi,
A dedicated kernel for 3D grouped convolution is currently not supported in TRT 7.
In TRT 7 we split the grouped convolution and call a kernel for each group. In your case, with 32 groups, the convolution runs 32 times, which might be what is causing the performance drop.
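Conceptually, this per-group splitting works as sketched below. This is an illustrative NumPy sketch, not TRT source code: for simplicity the per-group "conv" is a pointwise (1×1×1) convolution, i.e. a matmul over the channel dimension, but the channel bookkeeping (each of the G groups sees C_in/G input channels and produces C_out/G output channels, one kernel launch per group) is the same idea:

```python
import numpy as np

def conv_pointwise(x, w):
    # x: (C_in, D, H, W), w: (C_out, C_in) -> (C_out, D, H, W)
    # Stand-in for a single per-group convolution kernel call.
    return np.einsum('oc,cdhw->odhw', w, x)

def grouped_conv_split(x, w, groups):
    # x: (C_in, D, H, W); w: (C_out, C_in // groups), the usual
    # grouped-conv weight layout. Emulates splitting one grouped
    # conv into `groups` independent convolutions.
    c_in, c_out = x.shape[0], w.shape[0]
    cg_in, cg_out = c_in // groups, c_out // groups
    outs = []
    for g in range(groups):  # one kernel launch per group
        xg = x[g * cg_in:(g + 1) * cg_in]      # this group's input channels
        wg = w[g * cg_out:(g + 1) * cg_out]    # this group's filters
        outs.append(conv_pointwise(xg, wg))
    return np.concatenate(outs, axis=0)
```

With 32 groups this loop issues 32 separate launches, matching the 32 implicit_convolveND_sgemm invocations seen in the nvprof trace; a fused grouped-conv kernel would do the same work in one launch.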
I tested the 3D grouped convolution with cuDNN 7.6.5. Under nvprof, the implicit_convolveND_sgemm kernel still runs 32 times. Does cuDNN support a dedicated kernel for 3D grouped convolution?
That link isn’t up to date, though. There is currently INT8 support for 3D convolutions, but not according to your link, so I’m not sure whether I should take it as authoritative on Tensor Core support for 3D grouped convolutions. In fact, I’ve opened an issue suggesting there is none (No speedup from Tensor cores on 3d architecture with groupped convolutions · Issue #1198 · NVIDIA/TensorRT · GitHub), even though Tensor Core kernels do exist for 3D convolution.