Is there a way to tell if a chosen convolution algorithm requires NCHW-NHWC conversions?
The heuristics return both tensor core and non-tensor core algorithms. I am using NHWC as the default layout in FP16 mode, but there are cases where a series of convolutions does not use tensor cores. In these cases, all the NHWC-NCHW conversions can be avoided by using NCHW kernels between convolutions. To make this optimization, I need a reliable way to identify when conversions take place.
The documentation hints at when conversions could take place, but relying on documentation is a maintenance headache (a more general, forward-compatible, API-based solution is preferable). The documentation also doesn’t say with 100% certainty when a conversion would take place; it only says that cuDNN “may” perform conversions.
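
To make the intended optimization concrete, here is a rough sketch of the layout planning I have in mind. It does not call cuDNN at all: `ConvChoice`, `pick_layouts`, and `count_conversions` are hypothetical names, and the `uses_tensor_cores` flag is assumed to be known for each convolution already. The sketch also assumes that tensor core algorithms want NHWC and everything else is better off in NCHW, which is exactly the part I cannot currently verify through the API.

```cpp
#include <cstddef>
#include <vector>

enum class Layout { NCHW, NHWC };

// Hypothetical per-convolution record. How uses_tensor_cores is obtained
// is left open; it is an input to this sketch, not something it computes.
struct ConvChoice {
    bool uses_tensor_cores;
};

// Assumption: tensor core algorithms run natively on NHWC, all others on NCHW.
std::vector<Layout> pick_layouts(const std::vector<ConvChoice>& convs) {
    std::vector<Layout> layouts;
    layouts.reserve(convs.size());
    for (const ConvChoice& c : convs) {
        layouts.push_back(c.uses_tensor_cores ? Layout::NHWC : Layout::NCHW);
    }
    return layouts;
}

// Counts the layout changes between consecutive convolutions, i.e. the
// explicit conversions that remain after choosing layouts per convolution.
std::size_t count_conversions(const std::vector<Layout>& layouts) {
    std::size_t conversions = 0;
    for (std::size_t i = 1; i < layouts.size(); ++i) {
        if (layouts[i] != layouts[i - 1]) {
            ++conversions;
        }
    }
    return conversions;
}
```

With a plan like this, a run of consecutive non-tensor core convolutions stays in NCHW throughout and only the boundaries of the run need a conversion. The plan is only as good as the per-convolution flag, which is why an API-level answer matters.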