I’m evaluating TensorRT on a VGG-like model and my input is NCHW.
However, I noticed that TensorRT will transform my model to NHWC for faster inference.
Since my model comes from TensorFlow, can we directly use NHWC as the input so that we don’t need an input reformatter in TensorRT?
The input reformatter is very slow when the input is large:
conv1_1_input/Conv2D + (Unnamed Layer* 2) [Activation] input reformatter 0 0.55792
conv1_1_input/Conv2D + (Unnamed Layer* 2) [Activation] 0.98768
An NHWC tensor is faster than an NCHW tensor: performing a 32x32x3x3 convolution on a tensor of size 1x32x300x1680 gives:
NCHW + FP32: 3 ms on a 2070.
NHWC + FP32: 1.9 ms on a 2070.
Therefore, can we add NHWC support in TensorRT directly?
Maybe I’m missing something, but on that page I only see NHWC8. There’s also NHWC for plugins, but I didn’t see that we can directly pass an NHWC tensor to a convolution layer.
Can we go back to the original question about NHWC format support for the convolution layer, since it is faster on the latest GPUs?
In your blog post there’s nothing about the NHWC format.
In the link you provided, the input is set to NCHW as well: parser->registerInput("Input_0", DimsCHW(1, 28, 28), UffInputOrder::kNCHW);
When I look at the TensorFlow code,
it transposes the tensor from NHWC to NCHW in order to use IConvolutionLayer.
But I know that IConvolutionLayer tries to transpose it back in order to use Tensor Cores.
In that case, why not make TensorRT support NHWC for IConvolutionLayer?
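To make the cost concrete: the reformat the parser inserts is a pure index permutation over the whole tensor. Here is a minimal standalone sketch of what an NHWC-to-NCHW reformat does on the host (illustrative only; the function name is mine, and the real TensorRT reformat runs as a GPU kernel):

```cpp
#include <cstddef>
#include <vector>

// Copy an NHWC-laid-out tensor into NCHW order.
// Source index:      ((n*H + h)*W + w)*C + c
// Destination index: ((n*C + c)*H + h)*W + w
std::vector<float> nhwcToNchw(const std::vector<float>& src,
                              std::size_t N, std::size_t H,
                              std::size_t W, std::size_t C) {
    std::vector<float> dst(src.size());
    for (std::size_t n = 0; n < N; ++n)
        for (std::size_t h = 0; h < H; ++h)
            for (std::size_t w = 0; w < W; ++w)
                for (std::size_t c = 0; c < C; ++c)
                    dst[((n * C + c) * H + h) * W + w] =
                        src[((n * H + h) * W + w) * C + c];
    return dst;
}
```

Every element is touched once with a strided write, so for a 1x32x300x1680 input this is a full extra pass over the data before the convolution even starts — that is the latency showing up in the profiler line above.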
I’m wondering whether IConvolutionLayer could support NHWC input instead of NCHW input, so that we can avoid any shuffle or reformat when doing the convolution compute.
In the blog post, the shape is [1, 224, 224, 3], but if you look at the tf2onnx code it was referring to, the transpose is done when converting the TF model to ONNX.
I know the UFF parser supports kNHWC, but it transposes to NCHW before passing the tensor to IConvolutionLayer, and that is the additional latency we want to avoid. Do you see what the problem is?
TensorRT uses NCHW uniformly when defining the semantics of its operations. You can use the TensorFormat enum to gain access to TensorRT’s internal data layouts at network boundaries, which are optimized for Tensor Cores.
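For reference, a sketch of requesting one of those boundary layouts through the C++ API (builder, parser, and network setup omitted; kHWC8 is an FP16-only channel-packed NHWC format, and exact enum availability depends on your TensorRT version):

```cpp
// Sketch: ask TensorRT to accept a channel-packed NHWC (kHWC8),
// half-precision input at the network boundary, so the builder can
// elide the separate input reformat layer.
// Assumes `network` is a populated INetworkDefinition* and
// `config` is an IBuilderConfig*.
nvinfer1::ITensor* input = network->getInput(0);
input->setType(nvinfer1::DataType::kHALF);
input->setAllowedFormats(
    1U << static_cast<int>(nvinfer1::TensorFormat::kHWC8));
config->setFlag(nvinfer1::BuilderFlag::kFP16);
```

With this, the application is responsible for handing TensorRT a buffer already laid out in kHWC8, but the reformatter disappears from the engine.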