Which algo should be passed to cudnnConvolutionForward() when using Tensor Cores with NHWC?

Which algo of cudnnConvolutionForward() is supported
when using ‘Tensor Core operations’ together with ‘CUDNN_TENSOR_NHWC’?
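
For reference, CUDNN_TENSOR_NHWC means the channel index varies fastest in memory, whereas CUDNN_TENSOR_NCHW puts the width index innermost. A minimal sketch of the offset arithmetic for fully packed tensors follows. `Dims4`, `offset_nchw` and `offset_nhwc` are hypothetical helper names for illustration only, not cuDNN functions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type (not part of cuDNN): logical tensor dimensions.
struct Dims4 {
    std::size_t n, c, h, w;
};

// Linear offset of element (n, c, h, w) in a fully packed NCHW tensor.
// The width index is innermost.
std::size_t offset_nchw(const Dims4& d, std::size_t n, std::size_t c,
                        std::size_t h, std::size_t w) {
    assert(n < d.n && c < d.c && h < d.h && w < d.w);
    return ((n * d.c + c) * d.h + h) * d.w + w;
}

// Linear offset of the same logical element in a fully packed NHWC tensor.
// The channel index is innermost.
std::size_t offset_nhwc(const Dims4& d, std::size_t n, std::size_t c,
                        std::size_t h, std::size_t w) {
    assert(n < d.n && c < d.c && h < d.h && w < d.w);
    return ((n * d.h + h) * d.w + w) * d.c + c;
}
```

With dimensions N=2, C=8, H=4, W=4, stepping one channel moves 16 elements in NCHW but only 1 element in NHWC.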

According to ‘2.7. Tensor Core Operations’ in the cuDNN User Guide,
cudnnConvolutionForward() should be called with CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM.

However, according to ‘4.47. cudnnConvolutionForward’ in the cuDNN User Guide,
when wDesc is in CUDNN_TENSOR_NHWC format, the only supported algo is CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM.

I tested both algos.

IMPLICIT_GEMM causes a ‘Segmentation fault’.
IMPLICIT_PRECOMP_GEMM finishes normally, the result is correct, and it is twice as fast as FP32.
So I think there is no choice but to use IMPLICIT_PRECOMP_GEMM.
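
One way to check that a result is correct is to compare the GPU output against a plain CPU reference computed in the same layout. The following is only a sketch under stated assumptions, not the test used above: the input and output are packed NHWC, the filter is assumed to be stored as KRSC (K filters, each R x S x C), the operation is cross-correlation with dilation 1, and the names `ConvShape`, `out_dim`, `conv_nhwc_reference` and `all_close` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical problem description (not a cuDNN type).
struct ConvShape {
    int n, h, w, c;   // input: batch, height, width, channels
    int k, r, s;      // filter: output channels, filter height, filter width
    int pad, stride;  // same padding and stride in both spatial dimensions
};

// Output extent along one spatial dimension (dilation 1).
int out_dim(int in, int filt, int pad, int stride) {
    return (in + 2 * pad - filt) / stride + 1;
}

// Direct cross-correlation. x is packed NHWC, w is assumed packed KRSC,
// and the returned tensor is packed NHWC with K channels.
std::vector<float> conv_nhwc_reference(const ConvShape& p,
                                       const std::vector<float>& x,
                                       const std::vector<float>& w) {
    const int oh = out_dim(p.h, p.r, p.pad, p.stride);
    const int ow = out_dim(p.w, p.s, p.pad, p.stride);
    assert(x.size() == static_cast<std::size_t>(p.n) * p.h * p.w * p.c);
    assert(w.size() == static_cast<std::size_t>(p.k) * p.r * p.s * p.c);
    std::vector<float> y(static_cast<std::size_t>(p.n) * oh * ow * p.k, 0.0f);
    for (int n = 0; n < p.n; ++n)
      for (int oy = 0; oy < oh; ++oy)
        for (int ox = 0; ox < ow; ++ox)
          for (int k = 0; k < p.k; ++k) {
            float acc = 0.0f;
            for (int r = 0; r < p.r; ++r)
              for (int s = 0; s < p.s; ++s) {
                const int iy = oy * p.stride - p.pad + r;
                const int ix = ox * p.stride - p.pad + s;
                // Skip taps that fall into the zero padding.
                if (iy < 0 || iy >= p.h || ix < 0 || ix >= p.w) continue;
                for (int c = 0; c < p.c; ++c) {
                  const std::size_t xi =
                      ((static_cast<std::size_t>(n) * p.h + iy) * p.w + ix) * p.c + c;
                  const std::size_t wi =
                      ((static_cast<std::size_t>(k) * p.r + r) * p.s + s) * p.c + c;
                  acc += x[xi] * w[wi];
                }
              }
            y[((static_cast<std::size_t>(n) * oh + oy) * ow + ox) * p.k + k] = acc;
          }
    return y;
}

// Element-wise comparison with a mixed absolute/relative tolerance.
bool all_close(const std::vector<float>& a, const std::vector<float>& b, float tol) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (std::fabs(a[i] - b[i]) > tol * (1.0f + std::fabs(b[i]))) return false;
    return true;
}
```

When the GPU run uses FP16, the tolerance passed to `all_close` has to be loose enough to absorb half-precision rounding; the right value depends on the problem size and is not something this sketch decides.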

IMPLICIT_PRECOMP_GEMM also finishes normally in FP32 mode.