Convert a Caffe model to TensorRT 5.0

In TensorRT 4.0 there was a Python function named tensorrt.utils.caffe_to_trt_engine
that converted a Caffe model file to a TensorRT model file, as shown in this tutorial:

In short, something like:

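MODEL_PROTOTXT = './data/mnist/mnist.prototxt'
CAFFE_MODEL = './data/mnist/mnist.caffemodel'

# G_LOGGER, OUTPUT_LAYERS and SAVE_PATH are assumed to be defined elsewhere.
engine = trt.utils.caffe_to_trt_engine(G_LOGGER,
                                       MODEL_PROTOTXT,
                                       CAFFE_MODEL,
                                       1,        # max batch size
                                       1 << 20,  # max workspace size in bytes
                                       OUTPUT_LAYERS,
                                       trt.infer.DataType.FLOAT)

trt.utils.write_engine_to_file(SAVE_PATH, engine.serialize())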

However, it seems that the whole utils package is missing in TensorRT 5.0.

I’m seeking a tutorial explaining how to reproduce this functionality:

input: .prototxt and .caffemodel files
output: .engine file



It is possible to continue using the legacy API by importing the TensorRT legacy package. Simply replace import tensorrt as trt (for example) with import tensorrt.legacy as trt in your scripts.

In this case, the functions you need are in tensorrt.legacy.utils.

Eventually, legacy support will be dropped, so it is still advisable to migrate to the new API.

Yes, that indeed works, thanks.

Will an engine file created by tensorrt.legacy give the same performance as one built with the TensorRT 5.0 API, specifically on the Turing architecture?

I still can’t find examples in the new API docs of how to do this, and would appreciate it if you could consider adding such examples (how to import a model from Caffe).
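
For reference, here is a minimal sketch of what such an example might look like with the TensorRT 5 Python API, going from .prototxt and .caffemodel files to an .engine file. It relies on trt.Builder, trt.CaffeParser and build_cuda_engine from the TensorRT 5 Python package. The function name build_caffe_engine, its parameters and the default sizes are illustrative assumptions, not part of TensorRT, and the sketch has not been checked against every 5.x release.

```python
import os


def build_caffe_engine(deploy_file, model_file, output_names, engine_path,
                       max_batch_size=1, workspace_bytes=1 << 20):
    """Parse a Caffe model and write a serialized TensorRT engine to engine_path.

    Hypothetical helper: the name and parameters are assumptions for illustration.
    """
    # Fail early on missing inputs, before TensorRT is touched.
    for path in (deploy_file, model_file):
        if not os.path.isfile(path):
            raise FileNotFoundError(path)

    # Imported here so the input check above also runs where TensorRT is absent.
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    with trt.Builder(logger) as builder, \
            builder.create_network() as network, \
            trt.CaffeParser() as parser:
        builder.max_batch_size = max_batch_size
        builder.max_workspace_size = workspace_bytes

        # parse() populates the network and returns a blob-name-to-tensor table.
        blobs = parser.parse(deploy=deploy_file, model=model_file,
                             network=network, dtype=trt.float32)
        if blobs is None:
            raise RuntimeError("Caffe parser failed")

        # Unlike ONNX, Caffe outputs must be marked explicitly by blob name.
        for name in output_names:
            network.mark_output(blobs.find(name))

        engine = builder.build_cuda_engine(network)
        if engine is None:
            raise RuntimeError("engine build failed")

        with open(engine_path, "wb") as f:
            f.write(engine.serialize())
    return engine_path
```

With the MNIST files from the first post, and assuming the output blob is named prob, a call might look like build_caffe_engine(MODEL_PROTOTXT, CAFFE_MODEL, ['prob'], SAVE_PATH).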

Hi, I have the same issue.
I am deploying Faster R-CNN with GoogLeNet as the feature extractor.
It requires an RPROI-type layer, which is not available in earlier versions of TensorRT.
In another thread, you mentioned that it would be available in the new release.
Can you provide details on how to create a TensorRT engine from a Caffe model?


I have the same problem and a hard time finding examples.

Since you got it working, maybe you can tell me how you extracted/defined the output layers? I tried to take them from the model itself, but the conversion didn’t work.
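
One way to guess the output layer names is to read them from the deploy .prototxt: an output blob is typically a "top" that no later layer consumes as a "bottom". The sketch below is only a heuristic under that assumption. The function name find_output_blobs is hypothetical, it uses plain text matching rather than Caffe's protobuf parser, and on a training prototxt it would also report loss and accuracy blobs.

```python
import re


def find_output_blobs(prototxt_text):
    """Return blob names that are produced (top) but never consumed (bottom)."""
    # Drop comments so commented-out layers are ignored.
    text = re.sub(r'#.*', '', prototxt_text)
    tops = re.findall(r'\btop\s*:\s*"([^"]+)"', text)
    bottoms = set(re.findall(r'\bbottom\s*:\s*"([^"]+)"', text))

    outputs = []
    for name in tops:
        # In-place layers (top == bottom) are excluded because the name is consumed.
        if name not in bottoms and name not in outputs:
            outputs.append(name)
    return outputs
```

The returned names are candidates for the output-layer argument of the conversion call; it is still worth checking them against the network definition by hand.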


I use the TensorRT C++ API to convert trained Caffe models to optimized engines, then wrap the TRT engine inference code with Cython. In the end, I’m able to use the optimized model in Python.

This approach works for all of TensorRT 3.x, 4.x and 5.x.

Feel free to check out my blog post and code here:


Hi @jkjung13!

Thanks for the answer. I solved it myself, using the instructions on NVIDIA’s website (which were outdated) and another post I found online. The documentation is really poor.

I also found your page on my own (obviously). It’s an amazing piece of work, thanks for that. It got me started with the network implementation, and I can now adjust it to my needs. I found it kind of weird that such a basic function did not exist (or existed only in a bad form, as bad as Haar), so I am really glad you already solved this problem!