Fully connected layer using cuDNN library

Hi there,

I’m building a neural network with 2 convolution layers and 2 fully connected layers for one of my applications. I would like to use cuDNN library API calls to perform these operations. I was able to find cuDNN APIs for convolution layers, but I could not find any for fully connected layers. Are there any such APIs? Or should I manually convert the outputs of the conv layer to the right format and feed them to SGEMM?

Thanks in advance.

You don’t need to use another library, although the performance might be better with one. Just set up the convolution with a filter whose channel count, height, and width match those of the input. So, if you had a 4D NCHW tensor with dims [4,1,28,28], you would set the 2D convolution to have a stride of [1,1], a dilation of [1,1], and a padding of [0,0]. The filter dims will be [x,1,28,28], where x is the number of outputs of the layer. The output of the convolution will be [4,x,1,1]. The next convolution will have the same settings, but this time your filter will be [y,x,1,1]. After that, the same convolution settings again, with a filter of [z,y,1,1]. And so on.
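To see why the output collapses to [4,x,1,1], here is a minimal sketch of the output-size formula that the cuDNN documentation gives for forward convolution, applied to one spatial axis. The function name `conv_out_dim` is made up for illustration and this is plain CPU code, not a cuDNN call.

```cpp
#include <cassert>

// Output extent along one spatial axis:
//   out = 1 + (in + 2*pad - ((filter - 1)*dilation + 1)) / stride
// `conv_out_dim` is a hypothetical helper, not part of cuDNN.
int conv_out_dim(int in, int filter, int pad, int stride, int dilation) {
    assert(stride > 0 && dilation > 0);
    // Size of the filter footprint once dilation is taken into account.
    int effective = (filter - 1) * dilation + 1;
    assert(in + 2 * pad >= effective);
    return 1 + (in + 2 * pad - effective) / stride;
}
```

With a 28x28 input and a 28x28 filter (padding 0, stride 1, dilation 1), `conv_out_dim(28, 28, 0, 1, 1)` gives 1, so each filter produces exactly one value per image, which is what a fully connected neuron does.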

Recap.

Convolution 2D settings will always be: stride [1,1], padding [0,0], dilation [1,1].

PL == previous layer

Filter NCHW dims will be: [(# of output channels of this layer), (# of channels of PL), (H of PL), (W of PL)].
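The rule in the recap can be sketched as a small helper that walks a stack of fully connected layers and lists the filter dims for each one. `Dims4` and `fc_filter_dims` are hypothetical names used only for this illustration; nothing here calls cuDNN.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper type: four dims in NCHW order.
// For a filter these mean [outputs, input channels, height, width].
struct Dims4 {
    int n, c, h, w;
};

// Given the activation fed into the first fully connected layer and the
// number of outputs of each layer, return the filter dims of every layer.
std::vector<Dims4> fc_filter_dims(Dims4 input, const std::vector<int>& widths) {
    std::vector<Dims4> filters;
    Dims4 prev = input;  // activation of the previous layer (PL)
    for (int out : widths) {
        assert(out > 0);
        // The filter covers the whole previous activation.
        filters.push_back(Dims4{out, prev.c, prev.h, prev.w});
        // Stride 1, padding 0, dilation 1 give an output of [N, out, 1, 1].
        prev = Dims4{prev.n, out, 1, 1};
    }
    return filters;
}
```

For an input of [4,1,28,28] and layer widths of, say, 128, 64, and 10, this yields filters of [128,1,28,28], [64,128,1,1], and [10,64,1,1].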

Hi,

In case someone comes across this thread with the same question I had:

On my test hardware (RTX 2070), using cuDNN convolutions instead of a cuBLAS GEMM is approx. 50% slower for a fully connected layer.

This chapter talks a bit about using GEMM for fully connected layers:

https://docs.nvidia.com/deeplearning/sdk/dl-performance-guide/index.html#fullyconnected-layer

I’ve implemented it here: https://github.com/andoma/saga/blob/master/src/fc.cpp (fp32 and fp16 variants)
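For readers wondering what the GEMM formulation looks like, here is a minimal CPU reference sketch. It is not the code from the linked repository and does not call cuBLAS; the function name `fc_forward_gemm` and the row-major layout are assumptions made for illustration. It relies on the fact that a contiguous NCHW activation [N,C,H,W] can be read directly as a row-major N x D matrix with D = C*H*W, so the conv output needs no reshuffling before the matrix product.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU reference for a fully connected layer as one matrix product:
//   Y[N x K] = X[N x D] * W^T, with W stored as [K x D], all row-major.
// A filter of dims [K,C,H,W] is already W when D = C*H*W.
std::vector<float> fc_forward_gemm(const std::vector<float>& x,
                                   const std::vector<float>& w,
                                   std::size_t n, std::size_t d, std::size_t k) {
    assert(x.size() == n * d);
    assert(w.size() == k * d);
    std::vector<float> y(n * k, 0.0f);
    for (std::size_t i = 0; i < n; ++i) {          // each sample in the batch
        for (std::size_t j = 0; j < k; ++j) {      // each output neuron
            float acc = 0.0f;
            for (std::size_t p = 0; p < d; ++p) {  // dot product over inputs
                acc += x[i * d + p] * w[j * d + p];
            }
            y[i * k + j] = acc;
        }
    }
    return y;
}
```

On the GPU the triple loop would be replaced by a single SGEMM call. Keep in mind that cuBLAS assumes column-major storage, so the operand order and transpose flags have to be chosen to match the row-major layout used here.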