Increased workspace allocation when using cuDNN v7.6.x for convolution with the FP16 data type

Recently I found that, for the same convolution operation, using FP16 as the data type allocates much more workspace than using float, even with the same convolution algorithm. How can I use Tensor Cores with FP16 as the data type for convolution while allocating less workspace?


Which cuDNN convolution forward preference are you using in this case?
Please see the link below for your reference: