TensorRT NCHW vs cuDNN NHWC

Hi everybody, I have a question regarding tensor memory layout in TensorRT. Some frameworks use an NCHW layout, others use NHWC.
Reading the cuDNN documentation, it seems 2D convolutions on Tensor Cores achieve the best performance with the NHWC memory layout. This thread, though, explicitly states that the TensorRT implementation is NCHW.
Since I have Tensor Cores available, is there a way for my CNN to take advantage of the extra performance provided by the NHWC memory layout? Or is this automatically taken care of by TensorRT?

Hi @albecenz,

Yes, it is automatically taken care of by TensorRT.
TRT internally tries all kinds of tensor layouts while building the engine and selects the fastest kernel/layout combination.
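For example, a minimal build sketch (assuming the TensorRT 8.x Python API; `model.onnx` is a hypothetical path): enabling FP16 makes the Tensor Core kernels, which prefer NHWC-like layouts internally, eligible during the builder's auto-tuning, while the network inputs and outputs stay NCHW (LINEAR):

```python
# Minimal build sketch (TensorRT 8.x Python API; "model.onnx" is a
# hypothetical path). I/O layout stays NCHW (LINEAR); the builder is
# free to use NHWC-style Tensor Core kernels internally.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:
    parser.parse(f.read())

config = builder.create_builder_config()
# FP16 makes Tensor Core kernels eligible; TRT's auto-tuner then picks
# whichever layout/kernel combination times fastest, inserting reformat
# layers only where needed.
config.set_flag(trt.BuilderFlag.FP16)

engine_bytes = builder.build_serialized_network(network, config)
```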

Thank you.

Thanks


@spolisetty

I believe a Reformat (e.g., NCHW to NHWC) still has a cost for large inputs, so would it be better to convert the model to NHWC before feeding it to the TensorRT converter? To the best of my knowledge, though, I have not seen any ONNX model in NHWC format. Do you have any idea?
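One way to see whether the builder actually inserted reformat layers (a sketch, assuming TensorRT >= 8.2 and a hypothetical serialized engine `model.engine`) is to query the engine inspector and scan the layer list for the reformat nodes TRT adds:

```python
# Sketch: list the layers of a built engine and look for builder-inserted
# reformats. Assumes TensorRT >= 8.2; "model.engine" is a hypothetical path.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)
with open("model.engine", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())

inspector = engine.create_engine_inspector()
# One line per layer; layout conversions typically show up with names
# like "Reformatting CopyNode for Input Tensor 0 ...".
for line in inspector.get_engine_information(
        trt.LayerInformationFormat.ONELINE).splitlines():
    if "Reformat" in line:
        print(line)
```

If no reformat layers show up near the convolutions, the layout handling cost is already being absorbed by the fused kernels the builder chose.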