cuDNN 8.0.4 Convolution Uses More GPU Memory

After upgrading my Caffe build from cuDNN 7.5 (CUDA 10) to cuDNN 8.0.4 (CUDA 11), GPU memory usage for both training and inference became higher. The only change I made was to fix cudnnConvolutionFwdAlgo_t to CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM to avoid allocating a workspace.
This is the only source file I changed. Could you suggest what might be wrong? Thanks.
sourcecode.zip (10.0 KB)

Hi @840302039,
CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM does not require any workspace.
Training also involves the backward paths. If you only changed the forward algorithm to algo 0 (CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM), it is likely that the backward paths require much more workspace than they did in 7.5.
Memory is also needed for auto-tuning.

Thanks!

Hi,
You can isolate the problem by checking the API logs. The logs list the workspace size required for each operation, for example:
workSpaceSizeInBytes: type=size_t; val=1288;
For how to generate API logs, see https://docs.nvidia.com/deeplearning/cudnn/developer-guide/index.html#api-logging

Thanks!

Thanks for your suggestion. I turned on API logging, and the logs show that every workSpaceSizeInBytes is 0. By the way, I have set all three algorithms as below.
fwd_algo_[i].algo = CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM;
bwd_filter_algo_[i].algo = CUDNN_CONVOLUTION_BWD_FILTER_ALGO_0;
bwd_data_algo_[i].algo = CUDNN_CONVOLUTION_BWD_DATA_ALGO_0;

So I think cuDNN 8 itself may occupy more memory than cuDNN 7.5.