What should be the standard pipeline for developing with TensorRT?

I’m currently using TensorRT to accelerate deep learning inference on a Jetson TX2. The application scenario is classical computer vision (basically classification, localization and segmentation). The common pipeline for these applications is to load a pre-trained model and then do fine-tuning or transfer learning. The models usually come from common frameworks like PyTorch or TensorFlow.
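
As an aside, a minimal sketch of that transfer-learning step in Keras might look roughly like the following (the frozen backbone and the 10-class head here are just placeholder assumptions, not part of my actual application):

import tensorflow as tf

# Load an ImageNet-pretrained backbone without its classifier head
base = tf.keras.applications.ResNet50(include_top=False, pooling='avg')
base.trainable = False  # freeze the backbone for transfer learning

# Placeholder head: 10 output classes is just an example
outputs = tf.keras.layers.Dense(10, activation='softmax')(base.output)
model = tf.keras.Model(inputs=base.input, outputs=outputs)
model.compile(optimizer='adam', loss='categorical_crossentropy')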

But parsing issues often occur when converting these pretrained models to TensorRT. I’m currently parsing pretrained models from TensorFlow. Following the procedure in the sample code, I first load a pretrained model and export it to a UFF model:

import tensorflow as tf
import uff

model = tf.keras.applications.InceptionV3(include_top=True)
model.load_weights('<.h5 file>')

# According to the sample code: freeze the graph and convert it to a UFF file
def save(model, filename):
    output_names = model.output.op.name
    sess = tf.keras.backend.get_session()
    frozen_graph = tf.graph_util.convert_variables_to_constants(
        sess, sess.graph.as_graph_def(), [output_names])
    uff.from_tensorflow(graphdef=frozen_graph,
                        output_filename=filename,
                        output_nodes=[output_names],
                        text=True)
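
For completeness, the function above would then be called like this (the output filename is just a hypothetical example):

# 'inceptionv3.uff' is only an example path
save(model, 'inceptionv3.uff')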

Then I wrote a script to parse the UFF model and build an engine that can later be serialized for further use:

#include <iostream>
#include "NvInfer.h"
#include "NvUffParser.h"

using namespace nvinfer1;
using namespace nvuffparser;

// gLogger is an ILogger implementation, e.g. the one from the TensorRT samples
ICudaEngine* UFFParser(const char* uff_file, int maxBatchSize, IUffParser* parser)
{
    IBuilder* builder = createInferBuilder(gLogger);
    INetworkDefinition* network = builder->createNetwork();

    if (!parser->parse(uff_file, *network, DataType::kFLOAT))
    {
        std::cout << "Failed to parse" << std::endl;
        exit(-1);
    }

    builder->setMaxBatchSize(maxBatchSize);
    builder->setMaxWorkspaceSize(8ULL << 30);   // use an unsigned 64-bit literal to avoid integer overflow
    ICudaEngine* engine = builder->buildCudaEngine(*network);
    if (!engine)
    {
        std::cout << "Unable to create engine" << std::endl;
        exit(-1);
    }

    network->destroy();
    builder->destroy();
    return engine;
}
// ....
int main()
{
    // input_name, output_name and uff_file are defined elsewhere
    auto parser = createUffParser();
    parser->registerInput(input_name, Dims3{3, 224, 224}, UffInputOrder::kNCHW);
    parser->registerOutput(output_name);
    ICudaEngine* engine = UFFParser(uff_file, 2, parser);
    // Do other stuff
    engine->destroy();
    parser->destroy();
    return 0;
}
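
Since the text above mentions serializing the engine for further use but the snippet stops at buildCudaEngine, here is a minimal sketch of the same parse-and-serialize step, assuming the TensorRT Python API as shipped with TensorRT 5; the input/output names, shape and file paths are just placeholders:

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_and_save_engine(uff_file, engine_file, input_name, output_name):
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network()
    parser = trt.UffParser()
    # NCHW input of shape (3, 224, 224), matching the C++ registerInput call
    parser.register_input(input_name, (3, 224, 224))
    parser.register_output(output_name)
    if not parser.parse(uff_file, network):
        raise RuntimeError('Failed to parse the UFF file')
    builder.max_batch_size = 2
    builder.max_workspace_size = 1 << 30
    engine = builder.build_cuda_engine(network)
    if engine is None:
        raise RuntimeError('Unable to create engine')
    # Serialize the engine so it can be reloaded later without re-parsing
    with open(engine_file, 'wb') as f:
        f.write(engine.serialize())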

But parsing may fail even for basic models like ResNet50. I’ve tried several models: VGG16, VGG19, ResNet50, InceptionV3 and DenseNet121.
Parsing failed for all of them except VGG16 and VGG19.

For example, ResNet50:

ERROR: UFFParser: Validator error: bn5c_branch2a/cond/Switch: Unsupported operation _Switch

I know there’s a list of currently supported operations at https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#support_op, and that we can use IPlugin to implement custom layers for unsupported operations.

I’m wondering: is there a model zoo whose models can be parsed directly, so that we can use these pretrained models in industry applications? In that case a pipeline could be built once and reused with no further effort.
Or, for the moment, can we only tackle problems model by model? For example, if we want to run inference on a ResNet50-based model with TensorRT and a parsing error occurs, developers have to put in the effort to solve it themselves (either replicate the model with the network-definition API or use IPlugin to help parsing). In that case the solution cannot be generalized to new models, since different unsupported operations may show up again.

Hello,

Thank you for your feedback. I’ve communicated your message to our engineering team. Your input is important to us. We will keep you updated.

In the meantime, TRT does support common object detection models (SSD, YOLO, Faster-RCNN) and has samples and plugins for all of them.

Regards,
NVIDIA Enterprise Support

I have fixed the parsing issues with these Keras-pretrained models (at least for those I have tried: ResNet50, InceptionV3, VGG, DenseNet). We need to adjust one setting of the Keras backend: the learning phase has to be set to inference mode before the model is constructed:

K.set_learning_phase(0)

Thanks to the answer here: https://devtalk.nvidia.com/default/topic/1038905/tensorrt/tensorrt-error-unsupport-opperation/post/5286712/#5286712. Then use the standard UFF exporting code written above (in TensorRT 5 the procedure is a bit different; I haven’t tried that yet).
Right now parsing succeeds for all the models I mentioned.
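
To make the ordering explicit, here is a minimal sketch (the ResNet50 model is just an example): the learning-phase call has to come before the model is built, so that layers such as BatchNorm are constructed in inference mode and do not emit the Switch ops the parser complains about.

import tensorflow as tf
from tensorflow.keras import backend as K

# Setting the learning phase to 0 (inference) must happen *before* the model
# is constructed; otherwise BatchNorm layers are built with training branches
# (tf.cond -> Switch/Merge) that the UFF parser cannot handle.
K.set_learning_phase(0)

model = tf.keras.applications.ResNet50(include_top=True)  # example model
# ... then freeze the graph and export to UFF with the save() function above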

Thanks a lot, I’m looking forward to your feedback!