I can already run my model with TF-TRT, but every time I load the model it takes an extremely long time. According to this post, that is not going to be fixed any time soon, and using pure TensorRT is recommended instead. But when I searched the docs online, I couldn't find anything about converting a TF2 SavedModel to UFF. The existing UFF converter only supports the frozen graph format, which has been deprecated since TF 2.0. Can anyone provide some guidance on this? Thanks.
You can convert the TF model into ONNX format with tf2onnx first.
TensorRT supports ONNX models.
You can then get performance data directly with our trtexec binary:
/usr/src/tensorrt/bin/trtexec --onnx=[your/model] --fp16 # use --int8 for INT8 mode
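For anyone who wants to script the two steps above, here is a minimal sketch that builds the tf2onnx and trtexec command lines from Python. The helper names, the placeholder paths, and the default opset of 11 are assumptions made for illustration, not something from this thread. The tf2onnx flags used (`--saved-model`, `--output`, `--opset`) are the ones its command-line converter documents, and the trtexec path is the one quoted above.

```python
import subprocess
import sys


def tf2onnx_command(saved_model_dir, onnx_path, opset=11):
    """Build the tf2onnx CLI call that converts a TF2 SavedModel to ONNX.

    The opset default is an assumption; pick one your TensorRT version supports.
    """
    return [
        sys.executable, "-m", "tf2onnx.convert",
        "--saved-model", saved_model_dir,
        "--output", onnx_path,
        "--opset", str(opset),
    ]


def trtexec_command(onnx_path, precision="fp16",
                    trtexec="/usr/src/tensorrt/bin/trtexec"):
    """Build the trtexec call; precision is 'fp16' or 'int8'."""
    if precision not in ("fp16", "int8"):
        raise ValueError("precision must be 'fp16' or 'int8'")
    return [trtexec, f"--onnx={onnx_path}", f"--{precision}"]


def convert_and_benchmark(saved_model_dir, onnx_path):
    """Run both steps in order; raises CalledProcessError if either one fails."""
    subprocess.run(tf2onnx_command(saved_model_dir, onnx_path), check=True)
    subprocess.run(trtexec_command(onnx_path), check=True)
```

Calling `convert_and_benchmark("my_saved_model", "model.onnx")` would run the conversion and then the benchmark, provided tf2onnx is installed in the current Python environment and trtexec exists at that path.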
I used the tf2onnx tool to convert the model to ONNX. Then I ran trtexec, but it complains:
[11/05/2020-22:01:15] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
ERROR: builtin_op_importers.cpp:2042 In function importMatMul:
Assertion failed: inputA->getType() != nvinfer1::DataType::kINT32 && inputB->getType() != nvinfer1::DataType::kINT32 && "TensorRT doesn't support INT32 inputs for MatMul!"
It seems TensorRT's MatMul operation doesn't support INT32, yet trtexec automatically cast the INT64 weights down to INT32. I haven't found any tf2onnx option to export as FP32 instead. Do you have any suggestions? MatMul is a very common operator, right? It shouldn't be unsupported.
You can modify the ONNX model with our GraphSurgeon API below:
Thanks! This is very helpful!
Hi, I'm not able to figure out how to use GraphSurgeon to change the type of a node, and I didn't see a suitable example in the source code. Could you point me to some resources? Thanks.
Sorry for the late update.
Please check the details in the following comment:
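As a rough sketch of what such a GraphSurgeon edit can look like: the snippet below assumes the INT32 inputs come from integer constants (weights) feeding MatMul nodes and casts those constants to float32. That is only an assumption; which tensors actually carry the integer type depends on the model, so inspect the graph first. The function names and file paths are hypothetical. The helper relies on `gs.Constant` tensors exposing `.values` as a NumPy array, while `gs.Variable` tensors do not.

```python
try:
    import onnx
    import onnx_graphsurgeon as gs
except ImportError:  # the helper below only needs graph-like objects
    onnx = gs = None


def cast_int_matmul_constants(graph):
    """Cast integer constant inputs of MatMul nodes to float32.

    Returns the names of the tensors that were changed.
    """
    changed = []
    for node in graph.nodes:
        if node.op != "MatMul":
            continue
        for tensor in node.inputs:
            # Only constants carry .values; variables are skipped.
            values = getattr(tensor, "values", None)
            if values is not None and values.dtype.kind in "iu":
                tensor.values = values.astype("float32")
                changed.append(tensor.name)
    return changed


def patch_model(src_path, dst_path):
    """Load an ONNX file, patch it, and save the result (paths are placeholders)."""
    graph = gs.import_onnx(onnx.load(src_path))
    changed = cast_int_matmul_constants(graph)
    graph.cleanup().toposort()
    onnx.save(gs.export_onnx(graph), dst_path)
    return changed
```

After something like `patch_model("model.onnx", "model_fp32.onnx")`, the patched file can be passed to trtexec as before. Note that a constant shared with other nodes is changed for all of them, and that if the integer tensor is produced by another node rather than stored as a constant, this sketch leaves it alone and the producing node has to be changed instead.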