Python API question

I have an engine that I serialized using C++. Now I would like to load it and use it in Python. I have found two different ways of doing this: calling trt.utils.load_engine, or creating a runtime object and then calling runtime.deserialize_cuda_engine. What is the difference, and which way is preferred? I was using the first one since it has some Python documentation, while the second looks like just a SWIG wrapper (or is it Cython?).

And do I even need to create a runtime object in the first place?

And why does the code in the Python custom_layers.py example create an engine, then create a context from the engine, and then get the engine back from the context?

Correction:

With TRT 4.0, trt.utils.load_engine is the preferred/easiest method.
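
For reference, here is a minimal sketch of the two approaches side by side, assuming the TRT 4.x legacy Python API (exact module paths and signatures can vary between versions, and "model.engine" is just a placeholder path):

```python
import tensorrt as trt

G_LOGGER = trt.infer.ConsoleLogger(trt.infer.LogSeverity.ERROR)

# Option 1 (preferred in TRT 4.0): one call that reads the file
# and deserializes it into an engine for you.
engine = trt.utils.load_engine(G_LOGGER, "model.engine")

# Option 2: the manual route -- create a runtime, read the
# serialized bytes yourself, and deserialize them.
runtime = trt.infer.create_infer_runtime(G_LOGGER)
with open("model.engine", "rb") as f:
    engine2 = runtime.deserialize_cuda_engine(f.read())
```

Both should end up with the same deserialized CUDA engine; load_engine just saves you the runtime and file handling.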

Thanks. Could you please also clarify the engine -> context -> engine pattern, and whether I need to create a runtime object?

Hello,

I assume you are referencing this example? https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/python_api/workflows/tf_to_tensorrt.html

You wouldn’t have to do any of that. If you already have the saved engine file, you can just call utils.load_engine. The doc you pointed to is just an example of how to do inference with an engine and then save/load it.

No need for runtime objects with utils.load_engine.
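
To make that concrete, here is a sketch of the whole flow once the serialized engine file exists (again assuming the TRT 4.x legacy Python API; "model.engine" is a placeholder and the buffer setup for execute() is omitted):

```python
import tensorrt as trt

G_LOGGER = trt.infer.ConsoleLogger(trt.infer.LogSeverity.ERROR)

# Load the engine that was serialized from C++ -- no runtime object needed.
engine = trt.utils.load_engine(G_LOGGER, "model.engine")

# Create an execution context from the engine and run inference with it.
# There is no need to get the engine back out of the context afterwards.
context = engine.create_execution_context()
# ... allocate input/output device buffers, then call
# context.execute(batch_size, bindings) ...

context.destroy()
engine.destroy()
```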

Thanks

Hi iliya, NVES,

Is there any sample C++ code or a GitHub link for using a UFF file on a Jetson TX2?

I have converted my trained model to a UFF file and copied it to the Jetson, but I am not sure how to use it with TensorRT. I have very little experience with C++, so I was not able to make sense of the jetson-inference GitHub repo.

Hi @chandrakanta.
Not sure if you were referring to this, but that’s the only example I know of: https://github.com/NVIDIA-Jetson/tf_trt_models/

Hi @iliya,

Yes, I did check this as well, but it is for the pretrained models that are already available.

My requirement is: if I have a trained model, how can I leverage the power of TensorRT on the Jetson platform?

I can use TensorRT on my host system, but I can't do the same on the Jetson for my trained model, and if I run it using Python it is very slow.

From the many forum posts I came across, I understood that I have to convert the trained model to the UFF format, then copy it to the Jetson and write a C++ wrapper to convert it into an optimized TensorRT engine and use that for inference.

Correct me if my understanding is wrong.

I am new to the Jetson and just trying to do a small project.

Thanks

Hello,

Here is a C++ approach for the Jetson TX2: importing a UFF model and creating a TensorRT engine.
tensorrt/samples/sampleUffMNIST/sampleUffMNIST.cpp
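
In case it helps to see the shape of that sample's workflow, here is a rough sketch of the same steps (parse the UFF file, register the network input/output, build the engine, serialize it) using the legacy TRT Python API on a host machine. The tensor names, input shape, batch size, and workspace size below are placeholders for your own model, and the exact call names may differ by TensorRT version; the build step is normally run on the device that will do the inference, which is why the UFF file, rather than a prebuilt engine, is what gets copied to the Jetson.

```python
import tensorrt as trt
from tensorrt.parsers import uffparser

G_LOGGER = trt.infer.ConsoleLogger(trt.infer.LogSeverity.ERROR)

# Describe the graph to the UFF parser: input name + CHW shape, output name.
# "input", (3, 224, 224), and "output" are placeholders for your model.
parser = uffparser.create_uff_parser()
parser.register_input("input", (3, 224, 224), 0)
parser.register_output("output")

# Read the UFF file and build an optimized engine from it
# (max batch size 1, 1 GB of builder workspace here).
with open("model.uff", "rb") as f:
    uff_model = f.read()
engine = trt.utils.uff_to_trt_engine(G_LOGGER, uff_model, parser, 1, 1 << 30)

# Serialize the engine to disk so it can be reloaded for inference later.
trt.utils.write_engine_to_file("model.engine", engine.serialize())
```

The C++ sample walks through the same sequence with the corresponding C++ parser and builder classes, so it can serve as a map while reading sampleUffMNIST.cpp.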