Will there be any advantage in inference speed if I use Python to execute the inference when the .plan was created with C++?

Hi

When creating a .plan file using C++, either:

  • UFF to plan, OR
  • tf2onnx to plan

Will there be any advantage in inference speed if I use Python to execute the inference?

To summarize:

  1. C++: UFF/tf2onnx to plan
  2. Python: running inference

Thank you

Hi,

This depends on your use case.

For inference, both the C++ and Python interfaces link to the same TensorRT library, which is implemented in CUDA, so the performance is similar.
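
For illustration (not from the original reply): a minimal sketch of running inference from Python on a .plan that was serialized from C++, assuming a TensorRT 7/8-era bindings API and pycuda for device memory; the file name and shapes are placeholders.

```python
import numpy as np
import pycuda.autoinit  # creates a CUDA context on import
import pycuda.driver as cuda
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# The serialized .plan format is the same regardless of whether it was
# built from the C++ or the Python API.
with open("model.plan", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

context = engine.create_execution_context()

# Assuming one input (binding 0) and one output (binding 1).
h_input = np.random.rand(1, 3, 224, 224).astype(np.float32)  # placeholder shape
h_output = np.empty(trt.volume(engine.get_binding_shape(1)), dtype=np.float32)
d_input = cuda.mem_alloc(h_input.nbytes)
d_output = cuda.mem_alloc(h_output.nbytes)

# Copy the input to the GPU, run the engine, copy the result back.
cuda.memcpy_htod(d_input, h_input)
context.execute_v2([int(d_input), int(d_output)])
cuda.memcpy_dtoh(h_output, d_output)
```

The device buffers can be reused across calls; if per-call Python overhead matters, execute_async_v2 with a CUDA stream avoids a synchronization on every invocation.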

Some users prefer Python for its rich preprocessing modules.
However, if you are going to write custom CUDA code, C++ will be the better choice.

Thanks.

Thank you for the quick answer. I have some other questions:

  1. Is the Python API a wrapper of the C++ API?

  2. In this GitHub repo: NVIDIA-AI-IOT/tf_trt_image_classification, I noticed that everything is done in Python except for the generation of the .plan file (UFF → plan), which is done in C++.

    So, why not just do everything in Python? What is the advantage of creating the plan file in C++ (performance, or just the CUDA code)?

  3. The NVIDIA TensorRT documentation says the following:

Does that mean that using C++ will be faster when doing inference?

Hi,

1. Yes.

2. TensorRT Python support was added after that sample was released, so the whole pipeline can now be done in Python (see the sketch after this list).

3. Python might be slightly slower due to the wrapper overhead.
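
Regarding point 2, here is a sketch of building the .plan entirely in Python from an ONNX model (not from the original reply; it assumes the TensorRT 7/8-era builder API, and "model.onnx" is a placeholder path):

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Parse the ONNX model into a TensorRT network (explicit-batch mode).
builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, TRT_LOGGER)
with open("model.onnx", "rb") as f:  # placeholder path
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("failed to parse the ONNX model")

# Build the engine and serialize it to a .plan file.
config = builder.create_builder_config()
config.max_workspace_size = 1 << 30  # 1 GiB scratch space (pre-8.4 API)
engine = builder.build_engine(network, config)
with open("model.plan", "wb") as f:
    f.write(engine.serialize())
```

Newer TensorRT 8.x releases replace build_engine/max_workspace_size with build_serialized_network and memory-pool limits, but a .plan built this way is interchangeable with one built from C++ on the same GPU and TensorRT version.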

Thanks.
