Hi. I’m using cuDNN for dilated convolution.
I called cudnnGetConvolutionForwardAlgorithm() and cudnnGetConvolutionForwardWorkspaceSize(), which returned
the algorithm CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM with a workspace size of 0.
The performance seems low compared to Caffe, which implements dilated convolution via a cuBLAS GEMM.
How can I improve the performance of my cuDNN dilated convolution, or should I switch to a GEMM-based implementation? Thank you.
BTW, I’m using a Titan Xp with CUDA 9.0 and cuDNN 7.4.1.
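One thing worth trying before switching to GEMM: cudnnGetConvolutionForwardAlgorithm() only applies a heuristic, while cudnnFindConvolutionForwardAlgorithm() actually times every algorithm on your GPU and ranks them, so it may surface a faster choice (possibly one that needs a nonzero workspace). Below is a minimal sketch of that query; the tensor shapes, filter size, and dilation rate are hypothetical placeholders, not the poster's actual layer:

```cpp
// Sketch: benchmark all cuDNN forward algorithms for a dilated convolution
// and print them ranked by measured time. Shapes below are placeholders.
#include <cudnn.h>
#include <cstdio>

#define CHECK(call) do { cudnnStatus_t s_ = (call); \
    if (s_ != CUDNN_STATUS_SUCCESS) { \
        printf("cuDNN error: %s\n", cudnnGetErrorString(s_)); return 1; } } while (0)

int main() {
    cudnnHandle_t handle;
    CHECK(cudnnCreate(&handle));

    // Hypothetical layer: 1x64x56x56 input, 64 3x3 filters, dilation 2,
    // padding 2 so the spatial size is preserved.
    cudnnTensorDescriptor_t xDesc, yDesc;
    cudnnFilterDescriptor_t wDesc;
    cudnnConvolutionDescriptor_t convDesc;
    CHECK(cudnnCreateTensorDescriptor(&xDesc));
    CHECK(cudnnCreateTensorDescriptor(&yDesc));
    CHECK(cudnnCreateFilterDescriptor(&wDesc));
    CHECK(cudnnCreateConvolutionDescriptor(&convDesc));

    CHECK(cudnnSetTensor4dDescriptor(xDesc, CUDNN_TENSOR_NCHW,
                                     CUDNN_DATA_FLOAT, 1, 64, 56, 56));
    CHECK(cudnnSetFilter4dDescriptor(wDesc, CUDNN_DATA_FLOAT,
                                     CUDNN_TENSOR_NCHW, 64, 64, 3, 3));
    CHECK(cudnnSetConvolution2dDescriptor(convDesc,
                                          /*pad_h, pad_w*/ 2, 2,
                                          /*stride_h, stride_w*/ 1, 1,
                                          /*dilation_h, dilation_w*/ 2, 2,
                                          CUDNN_CROSS_CORRELATION,
                                          CUDNN_DATA_FLOAT));

    int n, c, h, w;
    CHECK(cudnnGetConvolution2dForwardOutputDim(convDesc, xDesc, wDesc,
                                                &n, &c, &h, &w));
    CHECK(cudnnSetTensor4dDescriptor(yDesc, CUDNN_TENSOR_NCHW,
                                     CUDNN_DATA_FLOAT, n, c, h, w));

    // Time every algorithm on the actual hardware (the "Find" variant
    // allocates its own scratch buffers internally).
    cudnnConvolutionFwdAlgoPerf_t perf[CUDNN_CONVOLUTION_FWD_ALGO_COUNT];
    int returned = 0;
    CHECK(cudnnFindConvolutionForwardAlgorithm(
        handle, xDesc, wDesc, convDesc, yDesc,
        CUDNN_CONVOLUTION_FWD_ALGO_COUNT, &returned, perf));

    for (int i = 0; i < returned; ++i)
        if (perf[i].status == CUDNN_STATUS_SUCCESS)
            printf("algo %d: %.3f ms, workspace %zu bytes\n",
                   (int)perf[i].algo, perf[i].time, perf[i].memory);

    cudnnDestroyConvolutionDescriptor(convDesc);
    cudnnDestroyFilterDescriptor(wDesc);
    cudnnDestroyTensorDescriptor(yDesc);
    cudnnDestroyTensorDescriptor(xDesc);
    cudnnDestroy(handle);
    return 0;
}
```

If a winning algorithm reports a nonzero workspace, allocate that buffer and pass it to cudnnConvolutionForward(); refusing workspace memory is a common reason the heuristic falls back to a slower zero-workspace algorithm.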