TensorRT .trt custom model inference

Hello. I’ve created a UNet model in PyTorch and converted it to .trt format.
I did it this way: .pth -> .onnx -> .trt.

Now I want to run inference on the GPU, but I couldn’t find any samples showing
how to use the .trt format with the C++ API. Are there any examples of how to run it? Or should I run the .onnx instead?

Environment

TensorRT Version: 7
GPU Type: GeForce GTX 166…
Nvidia Driver Version: 440.64
CUDA Version: 10.2
CUDNN Version:
Operating System + Version: Ubuntu
Python Version (if applicable): 3.6
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.4
Baremetal or Container (if container which image + tag):

Please refer to the link below:
https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-700/tensorrt-developer-guide/index.html#perform_inference_c
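For reference, here is a minimal sketch of deserializing a serialized .trt engine and running inference with the TensorRT 7 C++ API, following the steps described in that section of the guide. The file name `unet.trt`, the binding names `"input"`/`"output"`, and the 1x3x256x256 / 1x1x256x256 shapes are assumptions for illustration; use whatever names and shapes your ONNX export actually produced.

```cpp
// load_trt_engine.cpp -- minimal sketch for TensorRT 7, not production code
#include <NvInfer.h>
#include <cuda_runtime_api.h>

#include <fstream>
#include <iostream>
#include <iterator>
#include <vector>

// Simple logger required by the TensorRT runtime.
class Logger : public nvinfer1::ILogger
{
    void log(Severity severity, const char* msg) noexcept override
    {
        if (severity <= Severity::kWARNING)
            std::cout << "[TRT] " << msg << std::endl;
    }
} gLogger;

int main()
{
    // 1. Read the serialized engine produced earlier (e.g. by trtexec).
    //    "unet.trt" is a placeholder file name.
    std::ifstream file("unet.trt", std::ios::binary);
    std::vector<char> blob((std::istreambuf_iterator<char>(file)),
                           std::istreambuf_iterator<char>());

    // 2. Deserialize it into an ICudaEngine (the third argument is the
    //    deprecated plugin factory, still part of the TRT 7 signature).
    nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(gLogger);
    nvinfer1::ICudaEngine* engine =
        runtime->deserializeCudaEngine(blob.data(), blob.size(), nullptr);
    nvinfer1::IExecutionContext* context = engine->createExecutionContext();

    // 3. Allocate one device buffer per binding. The binding names and sizes
    //    below are assumptions -- adjust them to your UNet export.
    const int inputIndex  = engine->getBindingIndex("input");
    const int outputIndex = engine->getBindingIndex("output");
    const size_t inputBytes  = 1 * 3 * 256 * 256 * sizeof(float);
    const size_t outputBytes = 1 * 1 * 256 * 256 * sizeof(float);

    void* buffers[2];
    cudaMalloc(&buffers[inputIndex], inputBytes);
    cudaMalloc(&buffers[outputIndex], outputBytes);

    std::vector<float> inputHost(1 * 3 * 256 * 256, 0.f);  // fill with your image data
    std::vector<float> outputHost(1 * 1 * 256 * 256);

    // 4. Copy the input to the GPU, run inference, copy the result back.
    cudaStream_t stream;
    cudaStreamCreate(&stream);
    cudaMemcpyAsync(buffers[inputIndex], inputHost.data(), inputBytes,
                    cudaMemcpyHostToDevice, stream);
    context->enqueueV2(buffers, stream, nullptr);
    cudaMemcpyAsync(outputHost.data(), buffers[outputIndex], outputBytes,
                    cudaMemcpyDeviceToHost, stream);
    cudaStreamSynchronize(stream);

    // 5. Clean up (TRT 7 objects are released with destroy()).
    cudaStreamDestroy(stream);
    cudaFree(buffers[inputIndex]);
    cudaFree(buffers[outputIndex]);
    context->destroy();
    engine->destroy();
    runtime->destroy();
    return 0;
}
```

Build it against TensorRT and the CUDA runtime, e.g. `g++ load_trt_engine.cpp -lnvinfer -lcudart`.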

Thanks

Is it possible to convert .pth -> .onnx -> .trt -> _trt.pth, so that we can use model_trt = TRTModule()?
Can you suggest a method?