TensorRT .trt custom model inference

Hello. I’ve created a U-Net model in PyTorch and converted it to .trt format.
I did it this way: .pth → .onnx → .trt.

Now I want to run inference on the GPU, but I couldn’t find any samples
showing how to use the .trt format with the C++ API. Are there any examples of how to run it? Or should I run the .onnx file instead?

Environment

TensorRT Version: 7
GPU Type: GeForce GTX 166…
Nvidia Driver Version: 440.64
CUDA Version: 10.2
CUDNN Version:
Operating System + Version: Ubuntu
Python Version (if applicable): 3.6
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.4
Baremetal or Container (if container which image + tag):

Please refer to the link below:

https://github.com/NVIDIA/TensorRT/blob/master/samples/opensource/sampleOnnxMNIST/sampleOnnxMNIST.cpp#L196
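The sample linked above builds its engine from the .onnx file at run time, but the same C++ API can also load an already-serialized .trt engine directly. The following is only a minimal sketch, assuming TensorRT 7; the file name, tensor names, and shapes are placeholders, not taken from the original post:

```cpp
#include <NvInfer.h>
#include <cuda_runtime_api.h>
#include <fstream>
#include <iostream>
#include <iterator>
#include <vector>

// Minimal logger required by the TensorRT runtime.
class Logger : public nvinfer1::ILogger
{
    void log(Severity severity, const char* msg) noexcept override
    {
        if (severity <= Severity::kWARNING)
            std::cout << msg << std::endl;
    }
} gLogger;

int main()
{
    // Read the serialized engine from disk ("unet.trt" is a placeholder name).
    std::ifstream file("unet.trt", std::ios::binary);
    std::vector<char> engineData((std::istreambuf_iterator<char>(file)),
                                 std::istreambuf_iterator<char>());

    // Deserialize it into an ICudaEngine and create an execution context.
    nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(gLogger);
    nvinfer1::ICudaEngine* engine = runtime->deserializeCudaEngine(
        engineData.data(), engineData.size(), nullptr);
    nvinfer1::IExecutionContext* context = engine->createExecutionContext();

    // Allocate device buffers for the bindings. One input and one output are
    // assumed here; "input"/"output" and the shapes are placeholders.
    const int inputIndex  = engine->getBindingIndex("input");
    const int outputIndex = engine->getBindingIndex("output");
    const size_t inputSize  = 1 * 3 * 256 * 256 * sizeof(float);
    const size_t outputSize = 1 * 1 * 256 * 256 * sizeof(float);
    void* buffers[2];
    cudaMalloc(&buffers[inputIndex], inputSize);
    cudaMalloc(&buffers[outputIndex], outputSize);

    // Copy the preprocessed input to the GPU, run inference, copy results back.
    std::vector<float> input(inputSize / sizeof(float), 0.f);
    std::vector<float> output(outputSize / sizeof(float));
    cudaStream_t stream;
    cudaStreamCreate(&stream);
    cudaMemcpyAsync(buffers[inputIndex], input.data(), inputSize,
                    cudaMemcpyHostToDevice, stream);
    context->enqueueV2(buffers, stream, nullptr);
    cudaMemcpyAsync(output.data(), buffers[outputIndex], outputSize,
                    cudaMemcpyDeviceToHost, stream);
    cudaStreamSynchronize(stream);

    // Clean up.
    cudaStreamDestroy(stream);
    cudaFree(buffers[inputIndex]);
    cudaFree(buffers[outputIndex]);
    context->destroy();
    engine->destroy();
    runtime->destroy();
    return 0;
}
```

Keep in mind that a serialized engine is specific to the TensorRT version and GPU it was built with, so the .trt file has to be deserialized in a matching environment.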

Thanks
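Regarding “or should I run the .onnx file instead?”: the linked sample parses the .onnx model and builds the engine at run time, then serializes it. Condensed, and again with placeholder file names rather than the sample’s exact code, that path looks roughly like this:

```cpp
#include <NvInfer.h>
#include <NvOnnxParser.h>
#include <fstream>
#include <iostream>

// Minimal logger, same idea as in the sketch above.
class Logger : public nvinfer1::ILogger
{
    void log(Severity severity, const char* msg) noexcept override
    {
        if (severity <= Severity::kWARNING)
            std::cerr << msg << std::endl;
    }
} gLogger;

int main()
{
    // Create builder, explicit-batch network definition, and ONNX parser.
    nvinfer1::IBuilder* builder = nvinfer1::createInferBuilder(gLogger);
    const uint32_t flags = 1U << static_cast<uint32_t>(
        nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
    nvinfer1::INetworkDefinition* network = builder->createNetworkV2(flags);
    nvonnxparser::IParser* parser = nvonnxparser::createParser(*network, gLogger);

    // Parse the ONNX model exported from PyTorch ("unet.onnx" is a placeholder).
    parser->parseFromFile("unet.onnx",
        static_cast<int>(nvinfer1::ILogger::Severity::kWARNING));

    // Build the engine.
    nvinfer1::IBuilderConfig* config = builder->createBuilderConfig();
    config->setMaxWorkspaceSize(1 << 28);  // 256 MiB scratch space, adjust as needed
    nvinfer1::ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);

    // Serialize the engine to disk; this is the ".trt" file that can later be
    // deserialized directly instead of re-parsing the ONNX model.
    nvinfer1::IHostMemory* serialized = engine->serialize();
    std::ofstream out("unet.trt", std::ios::binary);
    out.write(static_cast<const char*>(serialized->data()), serialized->size());

    // Clean up.
    serialized->destroy();
    engine->destroy();
    config->destroy();
    parser->destroy();
    network->destroy();
    builder->destroy();
    return 0;
}
```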

Is it possible to convert .pth → .onnx → .trt → _trt.pth, so that we can use model_trt = TRTModule()?
Can you suggest a method?