TensorRT's nvinfer1::INetworkDefinition::addFullyConnected() does not work as expected for C3D network

But I never saw memory usage increase much when I ran inference with this network as implemented with the TensorRT API. About 700 MB of additional memory was occupied if I forcibly added a call to cudnnxxx() in the inference code for this network, just as video-caffe does when it calls cudnnxxx() in its convolution layers.
I observed the memory increase with jtop, or by adding the code suggested here:
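For reference, a minimal sketch of how GPU memory usage can be measured in code with the CUDA runtime's `cudaMemGetInfo()` (this is my own illustration of the general technique, not the exact code linked above; it assumes a CUDA-capable device and the CUDA toolkit):

```cpp
// Sketch: query free/total device memory before and after a suspect call,
// e.g. around buildSerializedNetwork() or the first enqueue, to see how
// much additional GPU memory that step occupies.
#include <cuda_runtime.h>
#include <cstdio>

// Returns used device memory in MiB, or a negative value on error.
static double usedMiB() {
    size_t freeBytes = 0, totalBytes = 0;
    if (cudaMemGetInfo(&freeBytes, &totalBytes) != cudaSuccess) {
        return -1.0;
    }
    return (double)(totalBytes - freeBytes) / (1024.0 * 1024.0);
}

int main() {
    double before = usedMiB();
    // ... run the suspect step here (engine build, context creation,
    //     first inference, a forced cuDNN call, etc.) ...
    double after = usedMiB();
    if (before >= 0.0 && after >= 0.0) {
        printf("GPU memory delta: %.1f MiB\n", after - before);
    }
    return 0;
}
```

Note that on Jetson boards the GPU shares physical memory with the CPU, so jtop's system-wide numbers and `cudaMemGetInfo()`'s device numbers may not match exactly.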