What should be the standard pipeline for developing with TensorRT?

I’m currently using TensorRT to accelerate deep learning inference on a Jetson TX2. The application scenario is classical computer vision (basically classification, localization and segmentation). The common pipeline for these applications is to load a pre-trained model and then do fine-tuning or transfer learning. The models usually come from common frameworks like PyTorch or TensorFlow.
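
As an aside, a minimal sketch of that transfer-learning step in Keras might look roughly like the following (the frozen backbone and the 10-class head here are just placeholder assumptions, not part of my actual application):

import tensorflow as tf

# Load an ImageNet-pretrained backbone without its classifier head
base = tf.keras.applications.ResNet50(include_top=False, pooling='avg')
base.trainable = False  # freeze the backbone for transfer learning

# Placeholder head: 10 output classes is just an example
outputs = tf.keras.layers.Dense(10, activation='softmax')(base.output)
model = tf.keras.Model(inputs=base.input, outputs=outputs)
model.compile(optimizer='adam', loss='categorical_crossentropy')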

But parsing issues often occur when converting these pretrained models to TensorRT. I’m currently parsing pretrained models from TensorFlow. Following the procedure in the sample code, I first load a pretrained model and export it to a UFF model:

import tensorflow as tf
import uff

model = tf.keras.applications.InceptionV3(include_top=True)
model.load_weights('<.h5 file>')

# According to the sample code: freeze the graph and convert it to a UFF file
def save(model, filename):
    output_names = model.output.op.name
    sess = tf.keras.backend.get_session()
    frozen_graph = tf.graph_util.convert_variables_to_constants(
        sess, sess.graph.as_graph_def(), [output_names])
    uff.from_tensorflow(graphdef=frozen_graph,
                        output_filename=filename,
                        output_nodes=[output_names],
                        text=True)
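
For completeness, the function above would then be called like this (the output filename is just a hypothetical example):

# 'inceptionv3.uff' is only an example path
save(model, 'inceptionv3.uff')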

Then I wrote a script to parse the UFF model and build an engine that can later be serialized for further use:

#include <iostream>
#include "NvInfer.h"
#include "NvUffParser.h"

using namespace nvinfer1;
using namespace nvuffparser;

// gLogger is an ILogger implementation, e.g. the one from the TensorRT samples
ICudaEngine* UFFParser(const char* uff_file, int maxBatchSize, IUffParser* parser)
{
    IBuilder* builder = createInferBuilder(gLogger);
    INetworkDefinition* network = builder->createNetwork();

    if (!parser->parse(uff_file, *network, DataType::kFLOAT))
    {
        std::cout << "Failed to parse" << std::endl;
        exit(-1);
    }

    builder->setMaxBatchSize(maxBatchSize);
    builder->setMaxWorkspaceSize(8ULL << 30);   // use an unsigned 64-bit literal to avoid integer overflow
    ICudaEngine* engine = builder->buildCudaEngine(*network);
    if (!engine)
    {
        std::cout << "Unable to create engine" << std::endl;
        exit(-1);
    }

    network->destroy();
    builder->destroy();
    return engine;
}
// ....
int main()
{
    // input_name, output_name and uff_file are defined elsewhere
    auto parser = createUffParser();
    parser->registerInput(input_name, Dims3{3, 224, 224}, UffInputOrder::kNCHW);
    parser->registerOutput(output_name);
    ICudaEngine* engine = UFFParser(uff_file, 2, parser);
    // Do other stuff
    engine->destroy();
    parser->destroy();
    return 0;
}
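
Since the text above mentions serializing the engine for further use but the snippet stops at buildCudaEngine, here is a minimal sketch of the same parse-and-serialize step, assuming the TensorRT Python API as shipped with TensorRT 5; the input/output names, shape and file paths are just placeholders:

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_and_save_engine(uff_file, engine_file, input_name, output_name):
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network()
    parser = trt.UffParser()
    # NCHW input of shape (3, 224, 224), matching the C++ registerInput call
    parser.register_input(input_name, (3, 224, 224))
    parser.register_output(output_name)
    if not parser.parse(uff_file, network):
        raise RuntimeError('Failed to parse the UFF file')
    builder.max_batch_size = 2
    builder.max_workspace_size = 1 << 30
    engine = builder.build_cuda_engine(network)
    if engine is None:
        raise RuntimeError('Unable to create engine')
    # Serialize the engine so it can be reloaded later without re-parsing
    with open(engine_file, 'wb') as f:
        f.write(engine.serialize())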

But parsing may fail even for basic models like ResNet50. I’ve tried several models: VGG16, VGG19, ResNet50, InceptionV3 and DenseNet121.
Parsing failed for all of them except VGG16 and VGG19.

For example, ResNet50:

ERROR: UFFParser: Validator error: bn5c_branch2a/cond/Switch: Unsupported operation _Switch

I know there’s a list of currently supported operations at https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#support_op, and that we can use IPlugin to implement custom layers for unsupported operations.

I’m wondering: is there a model zoo whose models can be parsed directly, so that we can use these pretrained models in industry applications? In that case a pipeline could be built once and reused with no further effort.
Or, for the moment, can we only tackle problems model by model? For example, if we want to run inference on a ResNet50-based model with TensorRT and a parsing error occurs, developers have to put in the effort to solve it themselves (either replicate the model with the network-definition API or use IPlugin to help parsing). In that case the solution cannot be generalized to new models, since different unsupported operations may show up again.

Hello,

Thank you for your feedback. I’ve communicated your message to our engineering team. Your input is important to us. We will keep you updated.

In the meantime, TRT does support common object detection models (SSD, YOLO, Faster-RCNN) and has samples and plugins for all of them.

Regards,
NVIDIA Enterprise Support

I have fixed the parsing issues with these Keras-pretrained models (at least for those I have tried: ResNet50, InceptionV3, VGG, DenseNet). We need to adjust one setting of the Keras backend: the learning phase has to be set to inference mode before the model is constructed:

K.set_learning_phase(0)

Thanks to the answer here: https://devtalk.nvidia.com/default/topic/1038905/tensorrt/tensorrt-error-unsupport-opperation/post/5286712/#5286712. Then use the standard UFF exporting code written above (in TensorRT 5 the procedure is a bit different; I haven’t tried that yet).
Right now parsing succeeds for all the models I mentioned.
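
To make the ordering explicit, here is a minimal sketch (the ResNet50 model is just an example): the learning-phase call has to come before the model is built, so that layers such as BatchNorm are constructed in inference mode and do not emit the Switch ops the parser complains about.

import tensorflow as tf
from tensorflow.keras import backend as K

# Setting the learning phase to 0 (inference) must happen *before* the model
# is constructed; otherwise BatchNorm layers are built with training branches
# (tf.cond -> Switch/Merge) that the UFF parser cannot handle.
K.set_learning_phase(0)

model = tf.keras.applications.ResNet50(include_top=True)  # example model
# ... then freeze the graph and export to UFF with the save() function above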

Thanks a lot, I’m looking forward to your feedback!