Depthwise conv workspace size with cuDNN 7 Grouped Convolution

Hi there, I’m trying to implement depthwise convolution (forward) with cuDNN 7’s grouped convolution support. I have a convolution forward example that sets the output tensor descriptor with values from cudnnGetConvolution2dForwardOutputDim(), sets the group count to the number of input tensor channels with cudnnSetConvolutionGroupCount(), and allocates workspace memory with the size reported by cudnnGetConvolutionForwardWorkspaceSize(). The algorithm I’m using is CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM, which should have the widest support.

With groupCount set to > 1, cudnnGetConvolution2dForwardOutputDim() fails unless I set the filter descriptor’s output channel to 1. But even when that call passes, cudnnGetConvolutionForwardWorkspaceSize() always returns CUDNN_STATUS_BAD_PARAM. Without visibility into what’s causing the error, I’d like to know how the input, output, and filter descriptors should be set differently from a regular convolution.
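For comparison, here is a minimal sketch of the shapes I would expect, assuming the convention usually described for grouped convolution: the filter descriptor carries the total number of output channels K and the per-group input channel count C / groupCount. The code only computes dimensions and makes no cuDNN calls. Dims4, GroupedConvShapes, convOutDim, and depthwiseShapes are hypothetical helper names, and channelMultiplier is an assumed parameter (1 for plain depthwise).

```cpp
#include <cassert>

// Hypothetical helper: four dimensions in NCHW order.
// For a filter, the same fields are read as K, C-per-group, R, S.
struct Dims4 { int n, c, h, w; };

// Hypothetical bundle of the shapes that would go into the descriptors.
struct GroupedConvShapes {
    Dims4 input;     // N, C, H, W
    Dims4 filter;    // K, C / groupCount, R, S (assumed convention)
    Dims4 output;    // N, K, H_out, W_out
    int groupCount;  // value that would be passed as the group count
};

// Output size along one spatial axis for a padded, strided, dilated convolution.
int convOutDim(int in, int pad, int filter, int stride, int dilation) {
    return 1 + (in + 2 * pad - (((filter - 1) * dilation) + 1)) / stride;
}

// Shapes for a depthwise convolution expressed as a grouped convolution:
// one group per input channel, channelMultiplier output channels per group.
GroupedConvShapes depthwiseShapes(int n, int c, int h, int w,
                                  int r, int s,
                                  int pad, int stride, int dilation,
                                  int channelMultiplier) {
    assert(c > 0 && channelMultiplier > 0);
    GroupedConvShapes sh;
    sh.groupCount = c;                   // one group per input channel
    const int k = c * channelMultiplier; // total output channels
    assert(k % sh.groupCount == 0);      // K must divide evenly into groups
    sh.input  = {n, c, h, w};
    sh.filter = {k, c / sh.groupCount, r, s};  // per-group input channels = 1
    sh.output = {n, k,
                 convOutDim(h, pad, r, stride, dilation),
                 convOutDim(w, pad, s, stride, dilation)};
    return sh;
}
```

Under these assumptions, a 1x32x56x56 input with a 3x3 filter, padding 1, and stride 1 would give a filter of 32x1x3x3, a group count of 32, and a 1x32x56x56 output. It is the filter's input channel dimension that becomes 1, not the output channel count.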

P.S. I’m also setting the bias descriptor for 2D convolution with cudnnSetTensor4dDescriptor(), where n = 1, c = bias_vector.size and h = w = 1. Could that also be a cause of the error?

Thanks for the help!

Could you please let us know if you are still facing this issue?