Increased work_space allocation using cuDNN v7.6.x for convolution with the FP16 data type

Recently, I found that for the same convolution operation, using float as the data type allocates much less work_space than using FP16, even with the same convolution algorithm. How can I use Tensor Cores and the FP16 data type for the convolution while allocating less work_space?
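
For reference, here is a minimal sketch of how the two work_space sizes can be compared. The tensor shapes and the IMPLICIT_PRECOMP_GEMM algorithm are placeholder assumptions, not the exact setup in question, and error checking is omitted for brevity:

```c
// Sketch (cuDNN v7.x): query the forward-convolution workspace size for
// the same fixed algorithm with FP32 vs FP16 descriptors.
#include <cudnn.h>
#include <stdio.h>

static size_t query_workspace(cudnnHandle_t h, cudnnDataType_t dtype) {
    // Placeholder shapes: N=32, C=64, H=W=56, K=64, 3x3 filter, pad 1, stride 1.
    cudnnTensorDescriptor_t x, y;
    cudnnFilterDescriptor_t w;
    cudnnConvolutionDescriptor_t conv;
    cudnnCreateTensorDescriptor(&x);
    cudnnCreateTensorDescriptor(&y);
    cudnnCreateFilterDescriptor(&w);
    cudnnCreateConvolutionDescriptor(&conv);

    cudnnSetTensor4dDescriptor(x, CUDNN_TENSOR_NCHW, dtype, 32, 64, 56, 56);
    cudnnSetFilter4dDescriptor(w, dtype, CUDNN_TENSOR_NCHW, 64, 64, 3, 3);
    cudnnSetConvolution2dDescriptor(conv, 1, 1, 1, 1, 1, 1,
                                    CUDNN_CROSS_CORRELATION, dtype);
    // Allow Tensor Core kernels; has no effect for FP32 data on most setups.
    cudnnSetConvolutionMathType(conv, CUDNN_TENSOR_OP_MATH);
    cudnnSetTensor4dDescriptor(y, CUDNN_TENSOR_NCHW, dtype, 32, 64, 56, 56);

    size_t bytes = 0;
    // Same fixed algorithm for both data types, so the sizes are comparable.
    cudnnGetConvolutionForwardWorkspaceSize(
        h, x, w, conv, y,
        CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM, &bytes);

    cudnnDestroyTensorDescriptor(x);
    cudnnDestroyTensorDescriptor(y);
    cudnnDestroyFilterDescriptor(w);
    cudnnDestroyConvolutionDescriptor(conv);
    return bytes;
}

int main(void) {
    cudnnHandle_t h;
    cudnnCreate(&h);
    printf("FP32 workspace: %zu bytes\n", query_workspace(h, CUDNN_DATA_FLOAT));
    printf("FP16 workspace: %zu bytes\n", query_workspace(h, CUDNN_DATA_HALF));
    cudnnDestroy(h);
    return 0;
}
```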

Hi,

Which cuDNN convolution forward preference (cudnnConvolutionFwdPreference_t) are you using in this case?
Please refer to the link below:
https://docs.nvidia.com/deeplearning/sdk/cudnn-archived/cudnn_765/cudnn-api/index.html#cudnnConvolutionFwdPreference_t
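
For example, the CUDNN_CONVOLUTION_FWD_SPECIFY_WORKSPACE_LIMIT preference asks cuDNN to pick the fastest forward algorithm whose workspace fits under a caller-supplied byte limit. A hedged sketch, assuming xDesc/wDesc/convDesc/yDesc are already-configured FP16 descriptors with CUDNN_TENSOR_OP_MATH set on the convolution descriptor:

```c
// Sketch (cuDNN v7 API): choose the fastest forward algorithm whose
// workspace fits within a caller-chosen byte limit, then query the
// actual workspace the chosen algorithm needs.
size_t limit = 64 << 20;  // e.g. cap the workspace at 64 MiB
cudnnConvolutionFwdAlgo_t algo;
cudnnGetConvolutionForwardAlgorithm(
    handle, xDesc, wDesc, convDesc, yDesc,
    CUDNN_CONVOLUTION_FWD_SPECIFY_WORKSPACE_LIMIT, limit, &algo);

// The returned size will be <= limit for the selected algorithm.
size_t ws_bytes = 0;
cudnnGetConvolutionForwardWorkspaceSize(
    handle, xDesc, wDesc, convDesc, yDesc, algo, &ws_bytes);
```

With CUDNN_CONVOLUTION_FWD_NO_WORKSPACE, cuDNN instead returns an algorithm that needs no workspace at all, usually at some cost in speed; note that a very tight workspace cap may also force the selection of a non-Tensor-Core algorithm.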

Thanks