TensorRT 3: parser error for VGG16 with INT8

I’m trying to build an inference engine from a TensorFlow model, following the example at
https://devblogs.nvidia.com/int8-inference-autonomous-vehicles-tensorrt/

The differences from the above example are:

  • Converted the model from TensorFlow instead of Caffe; used the plain VGG16 from keras.applications
  • Channels-last order and different layer names
  • Resized to WIDTH = HEIGHT = 224 for both the calibrator and the input stream

It keeps throwing the following error:

    [TensorRT] INFO: Detecting Framework
    [TensorRT] INFO: Parsing Model from uff
    [TensorRT] INFO: UFFParser: parsing input_1
    [TensorRT] INFO: UFFParser: parsing block1_conv1/kernel
    [TensorRT] INFO: UFFParser: parsing block1_conv1/Conv2D
    [TensorRT] ERROR: UFFParser: Parser error: block1_conv1/Conv2D: Invalid weights types when converted
    [TensorRT] ERROR: Failed to parse UFF File
    

The weight type of this operation in the TensorFlow graph is DT_FLOAT.

The UFF model is obtained as follows:

    import tensorflow as tf
    import uff

    # Load the pretrained Keras VGG16 and grab its input/output node names
    model = tf.keras.applications.VGG16(include_top=True, weights='imagenet')
    model_input = model.input.name.split(':')[0]    # e.g. 'input_1'
    model_output = model.output.name.split(':')[0]  # e.g. 'predictions/Softmax'

    # Freeze the graph: fold variables into constants and drop training-only nodes
    graph = tf.get_default_graph().as_graph_def()
    sess = tf.keras.backend.get_session()
    frozen_graph = tf.graph_util.convert_variables_to_constants(sess, graph, [model_output])
    frozen_graph = tf.graph_util.remove_training_nodes(frozen_graph)

    # Convert the frozen GraphDef to UFF and write it to disk
    uff_model = uff.from_tensorflow(frozen_graph, [model_output])
    with open('VGG16.uff', 'wb') as dump:
        dump.write(uff_model)
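
Since the parser complains about the weight type, it is worth double-checking what dtype the folded weight constants actually carry in the frozen graph. A minimal sketch (assuming the frozen_graph built above) could look like this:

    from tensorflow.core.framework import types_pb2

    # Print the dtype of every Const node (the folded weights) in the frozen graph;
    # block1_conv1/kernel should report DT_FLOAT here.
    for node in frozen_graph.node:
        if node.op == 'Const':
            print(node.name, types_pb2.DataType.Name(node.attr['dtype'].type))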
    

The engine is created like this (this fails):

    batchstream = calibrator.ImageBatchStream(5, calibration_files, sub_mean_hwc)
    int8_calibrator = calibrator.PythonEntropyCalibrator(["input_1"], batchstream)
    engine = trt.lite.Engine(framework="uff",
                             path="VGG16.uff",
                             # stream=uff_model,
                             max_batch_size=1,
                             max_workspace_size=(256 << 20),
                             input_nodes={"input_1": (HEIGHT, WIDTH, CHANNEL)},
                             output_nodes=["predictions/Softmax"],
                             preprocessors={"input_1": sub_mean_hwc},
                             postprocessors={},
                             data_type=trt.infer.DataType.INT8,
                             calibrator=int8_calibrator,
                             logger_severity=trt.infer.LogSeverity.INFO)
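
For reference, sub_mean_hwc is only the preprocessing callback handed to the batch stream and the engine. A rough sketch of what it does (the resize library and the exact per-channel ImageNet mean values are assumptions here, following the usual VGG preprocessing) could be:

    import numpy as np
    import cv2  # assumption: OpenCV is used for resizing, as in the blog example

    def sub_mean_hwc(img):
        # Resize to the network input size and subtract the per-channel
        # ImageNet mean, keeping channels-last (HWC) order.
        img = cv2.resize(img, (WIDTH, HEIGHT)).astype(np.float32)
        img -= np.array([123.68, 116.779, 103.939], dtype=np.float32)
        return img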
    

Using framework='tf' and passing the model graph as a stream makes no difference. I have Ubuntu 16.04, a GTX 1060, CUDA 9.0, cuDNN 7.0.5, and TensorFlow 1.5.0-rc0.

Please help.

Hi,

I have the exact same problem, but with a different network and a different approach to creating the engine. We use a custom TensorFlow network, and it works perfectly fine with other data types such as FLOAT and HALF. The error message is the same:

    [TensorRT] INFO: UFFParser: parsing input_grey
    [TensorRT] INFO: UFFParser: parsing Inputs/grey/Reshape/shape
    [TensorRT] INFO: UFFParser: parsing Inputs/grey/Reshape
    [TensorRT] INFO: UFFParser: parsing block1g/conv2d/kernel
    [TensorRT] INFO: UFFParser: parsing block1g/conv2d/Conv2D
    [TensorRT] ERROR: UFFParser: Parser error: block1g/conv2d/Conv2D: Invalid weights types when converted
    

My code for creating the engine looks like this:

    # initialize Int8 Calibrator
    calibrator = PythonEntropyCalibrator(mycontroller, input_node_names, sess)
    
    # Build TensorRT inference engine
    logger = trt.infer.ConsoleLogger(trt.infer.LogSeverity.INFO)
    engine = trt.utils.uff_to_trt_engine(logger,
                                         uff_model,
                                         parser,
                                         metrics['batch_size'],
                                         1 << 20,
                                         trt.infer.DataType.INT8,
                                         calibrator=calibrator)
    

Please help, we really want to use TensorRT with INT8.

The way you are using uff_to_trt_engine does not support an INT8 calibrator as a parameter; try tensorrt.lite.Engine instead.

But my original message already uses tensorrt.lite.Engine.

We created a new “Deep Learning Training and Inference” section in Devtalk to improve the experience for deep learning, accelerated computing, and HPC users:
https://devtalk.nvidia.com/default/board/301/deep-learning-training-and-inference-/

We are moving active deep learning threads to the new section.

URLs for topics will not change with the re-categorization, so your bookmarks and links will continue to work as before.

    -Siddharth