Please help: TF-TRT DepthwiseConv2dNative NCHW/NHWC format mismatch error

Description

I'm trying to convert an official DNN model (ASLFeat, from the GitHub repository lzx551402/ASLFeat, the implementation of the CVPR'20 paper "ASLFeat: Learning Local Features of Accurate Shape and Localization") to TensorRT. The model takes a 1-channel grayscale image of size 480x480, but the conversion fails with the errors shown for the two cases below.

Case 1 (when using TF-TRT)

with session.Session(graph=ops.Graph()) as sess: 
    # deserialize the frozen graph
    with tf.io.gfile.GFile("./frozen_model_v2.pb", "rb") as f:
        frozen_graph = tf.GraphDef()
        frozen_graph.ParseFromString(f.read())
        converter = trt.TrtGraphConverter(  input_graph_def=frozen_graph,
                                            nodes_blacklist=['kpts','descs','scores'],
                                            max_workspace_size_bytes=(1 << 32),  # 4 GiB workspace
                                            precision_mode="FP16",
                                            maximum_cached_engines=100)
                                            #is_dynamic_op=True ) #output nodes
        trt_graph = converter.convert() 
        tf_inputs = tf.placeholder(tf.float32, [1,480,480,1], name='input') # define the input tensor 
        #print(tf_inputs.shape)
        #tf_inputs = tf.expand_dims(tf_inputs, 0) 
        x = np.random.sample([1, 480, 480, 1]).astype(np.float32)  # random 1-channel test image
        output_node = tf.import_graph_def(trt_graph, input_map={"input": tf_inputs},
                                                return_elements=['kpts:0','descs:0','scores:0'])  # ':0' selects the output tensors
        # feed the NumPy array directly; feed_dict does not accept a tf.data.Dataset
        sess.run(output_node, feed_dict={tf_inputs: x})

Dimensions must be equal, but are 480 and 1 for 'import_3/depthwise_6' (op: 'DepthwiseConv2dNative') with input shapes: [1,480,480,1], [3,3,1,1].
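One possible reading of this message, assuming the node 'import_3/depthwise_6' carries data_format="NCHW" after conversion (I have not confirmed this on the actual graph): the channel axis of a [1,480,480,1] input is then read as 480, which does not match the filter's in_channels of 1. The sketch below only mimics that channel check in plain Python. The function names are made up for illustration and are not TensorFlow APIs.

```python
def channel_axis(data_format):
    """Return the index of the channel axis for a 4-D layout string."""
    if data_format not in ("NHWC", "NCHW"):
        raise ValueError("unsupported data_format: %r" % (data_format,))
    return data_format.index("C")


def check_depthwise_shapes(input_shape, filter_shape, data_format):
    """Mimic the channel check of a depthwise convolution.

    filter_shape follows TensorFlow's depthwise filter layout:
    [filter_height, filter_width, in_channels, channel_multiplier].
    Returns the number of output channels when the shapes agree.
    """
    in_channels = input_shape[channel_axis(data_format)]
    filter_in_channels = filter_shape[2]
    if in_channels != filter_in_channels:
        raise ValueError(
            "Dimensions must be equal, but are %d and %d"
            % (in_channels, filter_in_channels)
        )
    return in_channels * filter_shape[3]


input_shape = [1, 480, 480, 1]   # shapes taken from the error message
filter_shape = [3, 3, 1, 1]

# Interpreted as NHWC, the channel axis is 1 and the check passes.
print(check_depthwise_shapes(input_shape, filter_shape, "NHWC"))

# Interpreted as NCHW, the channel axis is 480 and the check fails.
try:
    check_depthwise_shapes(input_shape, filter_shape, "NCHW")
except ValueError as err:
    print(err)
```

If this reading is right, the placeholder shape and the layout expected by the converted node disagree, which would explain why the numbers in the message are exactly 480 and 1.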

Case 2 (when using convert-to-uff)

Warning: No conversion function registered for layer: BatchToSpaceND yet.
Converting depthwise_1/BatchToSpaceND as custom op: BatchToSpaceND
Warning: No conversion function registered for layer: FloorMod yet.
Converting depthwise_1/required_space_to_batch_paddings/mod_1 as custom op: FloorMod
Warning: No conversion function registered for layer: FloorMod yet.
Converting depthwise_1/required_space_to_batch_paddings/mod as custom op: FloorMod
Warning: No conversion function registered for layer: AddV2 yet.
Converting depthwise_1/required_space_to_batch_paddings/add_1 as custom op: AddV2
Warning: No conversion function registered for layer: AddV2 yet.
Converting depthwise_1/required_space_to_batch_paddings/add as custom op: AddV2
Traceback (most recent call last):
  File "/usr/local/bin/convert-to-uff", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/uff/bin/convert_to_uff.py", line 92, in main
    debug_mode=args.debug
  File "/usr/local/lib/python3.6/dist-packages/uff/converters/tensorflow/conversion_helpers.py", line 229, in from_tensorflow_frozen_model
    return from_tensorflow(graphdef, output_nodes, preprocessor, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/uff/converters/tensorflow/conversion_helpers.py", line 178, in from_tensorflow
    debug_mode=debug_mode)
  File "/usr/local/lib/python3.6/dist-packages/uff/converters/tensorflow/converter.py", line 94, in convert_tf2uff_graph
    uff_graph, input_replacements, debug_mode=debug_mode)
  File "/usr/local/lib/python3.6/dist-packages/uff/converters/tensorflow/converter.py", line 79, in convert_tf2uff_node
    op, name, tf_node, inputs, uff_graph, tf_nodes=tf_nodes, debug_mode=debug_mode)
  File "/usr/local/lib/python3.6/dist-packages/uff/converters/tensorflow/converter.py", line 47, in convert_layer
    return cls.registry_[op](name, tf_node, inputs, uff_graph, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/uff/converters/tensorflow/converter_functions.py", line 408, in convert_depthwise_conv2d_native
    return _conv2d_helper(name, tf_node, inputs, uff_graph, func="depthwise", **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/uff/converters/tensorflow/converter_functions.py", line 433, in _conv2d_helper
    number_groups = int(wt.attr['value'].tensor.tensor_shape.dim[2].size)
IndexError: list index (2) out of range

Environment

TensorRT Version: 7.0.0.11 (tried on every GPU listed below)
GPU Type: V100, T4, RTX 2080 Ti
Nvidia Driver Version: >430
CUDA Version: 10.0
CUDNN Version: 7.6.0 or 7.6.5
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.6.5
TensorFlow Version (if applicable): 1.15.2 (source build)
PyTorch Version (if applicable): N/A
Baremetal or Container (if container which image + tag):

Relevant Files

Someone faced the same error and solved it, but neither he nor you explained how:
TRT5.0: Error when creating UFF graph with DepthwiseConv2dNative

Do I need to insert a transpose node with the graphsurgeon Python API? If so, how? It seems that the graphsurgeon APIs are only useful when the network contains unsupported or custom layers, which is not my case (the DepthwiseConv2dNative layer has been supported since TRT 5!).
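For reference, this is what such a transpose would have to do to the shapes, as a minimal plain-Python sketch. The permutation constants and the permute helper are illustrative names, not graphsurgeon or TensorFlow APIs, and I am not claiming that inserting this transpose is the actual fix.

```python
# Axis permutations a Transpose node would carry (output axis i = input axis perm[i]).
NHWC_TO_NCHW = (0, 3, 1, 2)
NCHW_TO_NHWC = (0, 2, 3, 1)


def permute(shape, perm):
    """Return the shape that results from transposing with the given permutation."""
    if sorted(perm) != list(range(len(shape))):
        raise ValueError("perm %r is not a permutation of the axes" % (perm,))
    return [shape[axis] for axis in perm]


nhwc = [1, 480, 480, 1]              # input shape used in this post
nchw = permute(nhwc, NHWC_TO_NCHW)
print(nchw)                          # [1, 1, 480, 480]
print(permute(nchw, NCHW_TO_NHWC))   # [1, 480, 480, 1]
```

In NCHW the channel axis is index 1, so after the first permutation the 1-channel input would line up with a [3,3,1,1] depthwise filter.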

Steps To Reproduce

  1. Download the attached frozen model (.pb) file:
    frozen_model_v2.pb (3.4 MB)
  2. Run the code below:
import tensorflow as tf 
from tensorflow.python.compiler.tensorrt import trt_convert as trt 
from tensorflow.python.framework import ops
from tensorflow.python.client import session
from tensorflow.core.framework import graph_pb2 
from tensorflow.python.framework import importer  
from tensorflow.python.platform import app 
from tensorflow.python.platform import gfile 
from tensorflow.python.summary import summary
import numpy as np
import cv2

inputs = 'input'
outputs = ['kpts','descs','scores']
 
with session.Session(graph=ops.Graph()) as sess: 
    # deserialize the frozen graph
    with tf.io.gfile.GFile("./frozen_model_v2.pb", "rb") as f:
        frozen_graph = tf.GraphDef()
        frozen_graph.ParseFromString(f.read())
        converter = trt.TrtGraphConverter(  input_graph_def=frozen_graph,
                                            nodes_blacklist=['kpts','descs','scores'],
                                            max_workspace_size_bytes=(1 << 32),  # 4 GiB workspace
                                            precision_mode="FP16",
                                            maximum_cached_engines=100)
                                            #is_dynamic_op=True ) #output nodes
        trt_graph = converter.convert() 
        tf_inputs = tf.placeholder(tf.float32, [1,480,480,1], name='input') # define the input tensor 
        #print(tf_inputs.shape)
        #tf_inputs = tf.expand_dims(tf_inputs, 0) 
        x = np.random.sample([1, 480, 480, 1]).astype(np.float32)  # random 1-channel test image
        output_node = tf.import_graph_def(trt_graph, input_map={"input": tf_inputs},
                                                return_elements=['kpts:0','descs:0','scores:0'])  # ':0' selects the output tensors
        # feed the NumPy array directly; feed_dict does not accept a tf.data.Dataset
        sess.run(output_node, feed_dict={tf_inputs: x})

Or just run "convert-to-uff frozen_model_v2.pb".
