TensorRT 6 OnnxParser does not seem to support dynamic shapes. Parsing an ONNX model converted from a TensorFlow graph with dynamic input dimensions fails with:

[TensorRT] ERROR: Parameter check failed at: …/builder/Network.cpp::addInput::671, condition: isValidDims(dims, hasImplicitBatchDimension())

#!/bin/bash                                                                                                                                                         
                                                                                                                                                                    
INPUTs=input_images:0                                                                                                                                               
OUTPUTs=feature_fusion/Conv_11/Sigmoid:0,feature_fusion/Conv_12/Sigmoid:0                                                                                           
                                                                                                                                                                    
PB_PATH=./models/resnet50_30w_nchw_no_is_training.pb                                                                                                                
ONNX_PATH=./models/resnet50_30w_nchw_no_is_training.onnx                                                                                                            
                                                                                                                                                                    
python3 -m tf2onnx.convert \                                                                                                                                        
    --input $PB_PATH \                                                                                                                                              
    --output $ONNX_PATH \                                                                                                                                           
    --inputs $INPUTs \                                                                                                                                              
    --outputs $OUTPUTs \                                                                                                                                            
    --fold_const \                                                                                                                                                  
    --opset 10 \                                                                                                                                                    
    --verbose
import os                                                                                                                                                           
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'                                                                                                                            
                                                                                                                                                                    
import warnings                                                                                                                                                     
warnings.filterwarnings(action='ignore', category=FutureWarning)                                                                                                    
                                                                                                                                                                    
#-------------------------------------------------------------------------------                                                                                    
                                                                                                                                                                    
import tensorrt as trt                                                                                                                                              
import common                                                                                                                                                       
from get_data import ModelData                                                                                                                                      
                                                                                                                                                                    
################################################################################                                                                                    
                                                                                                                                                                    
TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)                                                                                                                         
PB_PATH = './models/resnet50_30w_nchw_no_is_training.pb'                                                                                                            
ONNX_PATH = './models/resnet50_30w_nchw_no_is_training.onnx'                                                                                                        
ENGINE_PATH = './models/resnet50_30w_nchw_no_is_training.engine'                                                                                                    
                                                                                                                                                                    
################################################################################                                                                                    
                                                                                                                                                                    
if __name__ == '__main__':                                                                                                                                          
    flag = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)                                                                                               
    with trt.Builder(TRT_LOGGER) as builder, \                                                                                                                      
            builder.create_network(flag) as network, \                                                                                                              
            builder.create_builder_config() as config, \                                                                                                            
            trt.OnnxParser(network, TRT_LOGGER) as parser:                                                                                                          
        # inp = network.add_input(                                                                                                                                  
        #         name=ModelData.INPUT_NAME,                                                                                                                        
        #         dtype=ModelData.INPUT_DTYPE,                                                                                                                      
        #         shape=ModelData.INPUT_SHAPE)                                                                                                                      
        # Check the return value so parser errors are reported instead of
        # failing silently later
        with open(ONNX_PATH, 'rb') as f:
            if not parser.parse(f.read()):
                for i in range(parser.num_errors):
                    print(parser.get_error(i))
                                                                                                                                                                    
        # builder.max_batch_size = ModelData.MAX_BATCH                                                                                                              
        # builder.max_workspace_size = common.GiB(100)                                                                                                              
                                                                                                                                                                    
        # profile = builder.create_optimization_profile()                                                                                                           
        # profile.set_shape(                                                                                                                                        
        #         ModelData.INPUT_NAME,                                                                                                                             
        #         ModelData.MIN_INPUT_SHAPE,                                                                                                                        
        #         ModelData.OPT_INPUT_SHAPE,                                                                                                                        
        #         ModelData.MAX_INPUT_SHAPE)                                                                                                                        
        # config.add_optimization_profile(profile)                                                                                                                  
                                                                                                                                                                    
        # engine = builder.build_engine(network, config)                                                                                                            
        # with open(ENGINE_PATH, 'wb') as f: f.write(engine.serialize())
import os                                                                                                                                                           
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'                                                                                                                            
                                                                                                                                                                    
import warnings                                                                                                                                                     
warnings.filterwarnings(action='ignore', category=FutureWarning)                                                                                                    
                                                                                                                                                                    
#-------------------------------------------------------------------------------                                                                                    
                                                                                                                                                                    
import tensorrt as trt                                                                                                                                              
                                                                                                                                                                    
################################################################                                                                                                    
                                                                                                                                                                    
class ModelData:                                                                                                                                                    
    MIN_BATCH = 1                                                                                                                                                   
    OPT_BATCH = 1                                                                                                                                                   
    MAX_BATCH = 1                                                                                                                                                   
                                                                                                                                                                    
    MIN_SIDE = 320                                                                                                                                                  
    OPT_SIDE = 640                                                                                                                                                  
    MAX_SIDE = 1200                                                                                                                                                 
                                                                                                                                                                    
    INPUT_NAME = 'input_images'                                                                                                                                     
    INPUT_DTYPE = trt.float32                                                                                                                                       
    INPUT_SHAPE = (-1, -1, -1, 3)                                                                                                                                   
                                                                                                                                                                    
    MIN_INPUT_SHAPE = (MIN_BATCH, MIN_SIDE, MIN_SIDE, 3)                                                                                                            
    OPT_INPUT_SHAPE = (OPT_BATCH, OPT_SIDE, OPT_SIDE, 3)                                                                                                            
    MAX_INPUT_SHAPE = (MAX_BATCH, MAX_SIDE, MAX_SIDE, 3)

Hi,

Optimization profiles currently only support networks with an explicit batch dimension. It appears that your network has an implicit batch dimension, which is causing the error you’re getting when trying to use the TensorRT API’s add_input() method.

I believe you will need to go back to your TensorFlow code that creates “resnet50_30w_nchw_no_is_training.pb” and adjust the input to have an explicit batch dimension of size -1 – Developer Guide :: NVIDIA Deep Learning TensorRT Documentation.

With a dynamic batch size (an explicit -1 dimension), I think you have to define at least one optimization profile in order to use the network in TensorRT. This is mentioned in the docs –
Developer Guide :: NVIDIA Deep Learning TensorRT Documentation – “When using runtime dimensions, you must create at least one optimization profile at build time.”

For an implicit batch network, there is currently no optimization profile support. TensorRT optimizes for the max_batch_size parameter, but allows any batch size as input.
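
For illustration, here is a minimal sketch of the two build paths described above. The input name and the min/opt/max shapes mirror the ModelData class from the script above; builder, config, and network are assumed to come from the usual Builder/OnnxParser setup.

# Implicit batch network: no profiles, TensorRT optimizes for max_batch_size
# builder.max_batch_size = 8

# Explicit batch network with runtime dimensions: at least one
# optimization profile is required at build time
profile = builder.create_optimization_profile()
profile.set_shape('input_images',
                  (1, 320, 320, 3),     # min
                  (1, 640, 640, 3),     # opt
                  (1, 1200, 1200, 3))   # max
config.add_optimization_profile(profile)
engine = builder.build_engine(network, config)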

Thanks,
NVIDIA Enterprise Support

@NVES_R

The network already has an explicit batch dimension, as the following *.pbtxt shows.
So I suppose the problem is not caused by that?

node {
  name: "input_images"                                                                                                                                              
  op: "Placeholder"                                                                                                                                                 
  attr {                                                                                                                                                            
    key: "dtype"                                                                                                                                                    
    value {                                                                                                                                                         
      type: DT_FLOAT                                                                                                                                                
    }                                                                                                                                                               
  }                                                                                                                                                                 
  attr {                                                                                                                                                            
    key: "shape"                                                                                                                                                    
    value {                                                                                                                                                         
      shape {                                                                                                                                                       
        dim {                                                                                                                                                       
          size: -1                                                                                                                                                  
        }                                                                                                                                                           
        dim {                                                                                                                                                       
          size: -1                                                                                                                                                  
        }                                                                                                                                                           
        dim {                                                                                                                                                       
          size: -1                                                                                                                                                  
        }                                                                                                                                                           
        dim {                                                                                                                                                       
          size: 3                                                                                                                                                   
        }                                                                                                                                                           
      }                                                                                                                                                             
    }                                                                                                                                                               
  }                                                                                                                                                                 
}
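
One way to double-check that these dynamic dimensions actually survive the tf2onnx conversion is to inspect the input of the generated ONNX model (a small sketch, assuming the onnx Python package is installed):

import onnx

model = onnx.load('./models/resnet50_30w_nchw_no_is_training.onnx')
# Dynamic dimensions show up as symbolic dim_param entries instead of
# fixed dim_value entries for the 'input_images' tensor.
print(model.graph.input[0])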

@NVES_R

Interesting, I’m not sure off the top of my head then.

Can you send me “./models/resnet50_30w_nchw_no_is_training.pb” and whatever else I might need from this code, so I can reproduce it and look into it further?

Also, “@NVES_R” doesn’t actually send me a notification or anything; devtalk doesn’t support that feature the way GitHub does.

Here is a simple example that reproduces the problem described above.

# Author: Jiarenyf ...

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

import warnings
warnings.filterwarnings(action='ignore', category=FutureWarning)

#---------------------------------------------------------------------

import tensorrt as trt
import tensorflow as tf

#---------------------------------------------------------------------

TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)

INP_NAME = 'input'
OUT_NAME = 'output'

PB_PATH = '/dev/shm/debug_onnx.pb'
ONNX_PATH = '/dev/shm/debug_onnx.onnx'
ENGINE_PATH = '/dev/shm/debug_onnx.engine'

######################################################################

if __name__ == '__main__':
    with tf.Graph().as_default() as graph:
        t = tf.placeholder(tf.float32, shape=(None,None,None,3), name=INP_NAME)
        w = tf.shape(t)[1] // 2
        h = tf.shape(t)[2] // 2
        ns = tf.stack([h, w])
        t = tf.image.resize_bilinear(t, ns)
        t = tf.nn.relu6(t, name=OUT_NAME)
        tf.io.write_graph(graph, '.', PB_PATH, as_text=False)
        tf.io.write_graph(graph, '.', f'{PB_PATH}txt', as_text=True)
        os.system(
            f"python3 -m tf2onnx.convert "
            f"--input {PB_PATH} "
            f"--output {ONNX_PATH} "
            f"--inputs {INP_NAME}:0 "
            f"--outputs {OUT_NAME}:0 "
            f"--fold_const "
            f"--opset 9 "
            f"--verbose "
        )
    assert os.path.exists(ONNX_PATH)

#---------------------------------------------------------------------

    print('\n\n\nError Here !!!')
    flag = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    with trt.Builder(TRT_LOGGER) as builder, \
            builder.create_network(flag) as network, \
            builder.create_builder_config() as config, \
            trt.OnnxParser(network, TRT_LOGGER) as parser:
        with open(ONNX_PATH, 'rb') as f:
            if not parser.parse(f.read()):
                for i in range(parser.num_errors):
                    print(parser.get_error(i))

Thanks for the code. I was able to reproduce it, but I’m not quite sure why it’s happening.

I’ve escalated this issue to the engineering team and will let you know when they get back to me.

Any feedback?

They’re still looking into this one.

Any feedback so far?

Just heard back.

There are two ways to specify the resized dimensions: either through the scales input or the sizes input of the Resize node.

In TensorRT 6 (https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/c_api/classnvinfer1_1_1_i_resize_layer.html) as well as tentatively in the next release, we expect the scales to be a constant.

In your TF graph, and in the resulting ONNX graph, the resize scales are dynamic, so TensorRT cannot handle this case.

From the script:

w = tf.shape(t)[1] // 2  
h = tf.shape(t)[2] // 2

These values are dynamic, since the shape of t is (-1, -1, -1, 3), and TensorRT cannot handle a resize node with dynamic scales.
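
For comparison, this is the kind of resize the TensorRT 6 API can build directly, where the scales are compile-time constants (a sketch against the IResizeLayer Python bindings; input_tensor stands for whatever ITensor feeds the layer):

# Resize with constant, per-dimension scale factors (NCHW layout assumed)
resize = network.add_resize(input_tensor)
resize.resize_mode = trt.ResizeMode.LINEAR
resize.scales = [1.0, 1.0, 0.5, 0.5]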

TRT does support dynamic resizes given an expected output shape, and ONNX opset 11 supports this case. So if there is a way to generate an ONNX graph from TF with a Resize node that takes a dynamic output shape instead of dynamic scales, that would be the only viable workaround at the moment.

This is the official specification for the Resize node in ONNX (in opset 11): https://github.com/onnx/onnx/blob/master/docs/Operators.md#Resize

However, I believe the current TensorRT release only supports up to opset 10.
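
If a newer tf2onnx and a TensorRT release with opset 11 support become available, the workaround would be to re-export the graph at opset 11 and verify that the Resize node now carries a sizes input rather than scales. A rough sketch of that check (paths and flags follow the repro script; whether your tf2onnx version accepts --opset 11 is an assumption):

import os
import onnx

os.system(
    "python3 -m tf2onnx.convert "
    "--input /dev/shm/debug_onnx.pb "
    "--output /dev/shm/debug_onnx.onnx "
    "--inputs input:0 --outputs output:0 "
    "--opset 11 --verbose"
)

model = onnx.load('/dev/shm/debug_onnx.onnx')
for node in model.graph.node:
    if node.op_type == 'Resize':
        # Opset 10: inputs are [X, scales]; opset 11: [X, roi, scales, sizes]
        print(node.input)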