I’m currently using TensorRT to accelerate deep learning inference on the Jetson TX2. The application scenario is classical computer vision (basically classification, localization and segmentation). The common pipeline for these applications is to load a pre-trained model, then do fine-tuning or transfer learning. The models usually come from common frameworks like PyTorch or TensorFlow.
But parsing issues often occur when converting these pretrained models to TensorRT. I’m currently parsing pretrained models from TensorFlow. Following the sample code, I first loaded the pretrained model and exported it to a UFF model:
import tensorflow as tf
import uff

model = tf.keras.applications.InceptionV3(include_top=True)
model.load_weights('<.h5 file>')

# Following the sample code: freeze the graph and convert it to a UFF file
def save(model, filename):
    output_names = model.output.op.name
    sess = tf.keras.backend.get_session()
    frozen_graph = tf.graph_util.convert_variables_to_constants(
        sess, sess.graph.as_graph_def(), [output_names])
    uff.from_tensorflow(graphdef=frozen_graph,
                        output_filename=filename,
                        output_nodes=[output_names],
                        text=True)
Then I wrote a script to parse the UFF model and build an engine that can be serialized for later use:
ICudaEngine* UFFParser(const char* uff_file, int maxBatchSize, IUffParser* parser)
{
    IBuilder* builder = createInferBuilder(gLogger);
    INetworkDefinition* network = builder->createNetwork();
    if (!parser->parse(uff_file, *network, DataType::kFLOAT))
    {
        std::cout << "Fail to parse" << std::endl;
        exit(-1);
    }
    builder->setMaxBatchSize(maxBatchSize);
    // 1 GiB; note that 8 << 30 overflows a 32-bit int, and 8 GiB
    // exceeds the TX2's shared memory anyway
    builder->setMaxWorkspaceSize(1ULL << 30);
    ICudaEngine* engine = builder->buildCudaEngine(*network);
    if (!engine)
    {
        std::cout << "Unable to create engine" << std::endl;
        exit(-1);
    }
    network->destroy();
    builder->destroy();
    return engine;
}

// ....
int main()
{
    auto parser = createUffParser();
    // Input/output node names and dims must match the exported model,
    // e.g. Dims3{3, 299, 299} for InceptionV3 with include_top=True
    parser->registerInput(input_name, Dims3{3, 224, 224}, UffInputOrder::kNCHW);
    parser->registerOutput(output_name);
    ICudaEngine* engine = UFFParser(uff_file, 2, parser);
    // Do other stuff
    return 0;
}
But parsing may fail even for basic models. I’ve tried several: VGG16, VGG19, ResNet50, InceptionV3 and DenseNet121.
Parsing failed for all of them except VGG16 and VGG19.
For example, ResNet50:
ERROR: UFFParser: Validator error: bn5c_branch2a/cond/Switch: Unsupported operation _Switch
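From what I can tell, the Switch nodes come from the Keras batch-normalization layers, which build a tf.cond (and hence Switch/Merge ops) when the graph is constructed in training mode. Forcing inference mode before the model is built seems to avoid them. A minimal sketch of what I mean (TF 1.x Keras API assumed; ResNet50 used for illustration):

```python
import tensorflow as tf
import uff

# Force inference mode BEFORE the model is built, so batch-norm layers
# are traced without the training-time tf.cond/Switch nodes.
tf.keras.backend.set_learning_phase(0)

model = tf.keras.applications.ResNet50(include_top=True)

output_name = model.output.op.name
sess = tf.keras.backend.get_session()
frozen_graph = tf.graph_util.convert_variables_to_constants(
    sess, sess.graph.as_graph_def(), [output_name])

uff.from_tensorflow(graphdef=frozen_graph,
                    output_filename='resnet50.uff',
                    output_nodes=[output_name],
                    text=False)
```

But this is a per-model workaround, which brings me back to the general question below.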
I know there’s a list of currently supported operations (https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#support_op), and that IPlugin can be used to implement custom layers for unsupported operations.
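In the meantime, a quick pre-check against that list can at least fail fast before an expensive conversion attempt. The helper below is hypothetical: SUPPORTED_OPS is a small illustrative subset of the list in the developer guide, not the complete set, and the function only needs (name, op) pairs, e.g. `[(n.name, n.op) for n in frozen_graph.node]`:

```python
# Hypothetical pre-check: report op types the UFF parser would reject.
# SUPPORTED_OPS is an illustrative subset of the supported-operations
# list in the TensorRT developer guide, not the complete set.
SUPPORTED_OPS = {
    'Add', 'BiasAdd', 'Conv2D', 'MatMul', 'MaxPool', 'Mean',
    'Relu', 'Reshape', 'Softmax', 'ConcatV2', 'Pad', 'Identity',
    'Const', 'Placeholder', 'FusedBatchNorm',
}

def unsupported_ops(nodes):
    """nodes: iterable of (node_name, op_type) pairs."""
    bad = {}
    for name, op in nodes:
        if op not in SUPPORTED_OPS:
            bad.setdefault(op, []).append(name)
    return bad

# Example with the node that broke ResNet50 parsing:
nodes = [
    ('conv1/Conv2D', 'Conv2D'),
    ('bn5c_branch2a/cond/Switch', 'Switch'),
    ('bn5c_branch2a/cond/Merge', 'Merge'),
    ('fc1000/Softmax', 'Softmax'),
]
print(unsupported_ops(nodes))
# {'Switch': ['bn5c_branch2a/cond/Switch'], 'Merge': ['bn5c_branch2a/cond/Merge']}
```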
I’m wondering: do you have a model zoo whose models can be parsed directly, so that these pretrained models can be used in industrial applications? In that case a pipeline could be built once and reused with no further effort.
Or is it the case that, for the moment, we can only tackle problems model by model? For example, if we want to run inference on a ResNet50-based model with TensorRT and a parsing error occurs, developers have to put in the effort to resolve it themselves (either replicate the model using the network-definition API or use IPlugin to help parsing). In that case the solution cannot be generalized to new models, since different unsupported operations may appear again.