TensorRT Issues with Segment Anything Model

Description

I am using TensorRT to optimize the encoder and decoder models of the Segment Anything Model (SAM) from Facebook on a Jetson Orin NX.

With SAM_H (the largest model) in FP16, the TensorRT result is noticeably worse, even though the ONNX result is good.

I used the same method on SAM_L (the medium-sized model) in FP16, and that result is good too.
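To put a number on "worse" versus "good", the ONNX and TensorRT outputs for the same input could be compared directly. Below is a minimal sketch, assuming both outputs have already been flattened into plain lists of floats. The function name compare_outputs and the choice of metrics are my own for illustration and are not part of TensorRT or SAM.

```python
import math


def compare_outputs(reference, candidate):
    """Compare two equal-length sequences of floats.

    reference: e.g. the flattened ONNX output (assumed to be trustworthy)
    candidate: e.g. the flattened TensorRT FP16 output
    Returns the number of non-finite candidate values, plus the max absolute
    difference and cosine similarity over the remaining finite pairs.
    """
    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same number of elements")

    # NaN/inf values are counted separately so they do not poison the metrics.
    non_finite = sum(1 for c in candidate if not math.isfinite(c))
    pairs = [
        (r, c)
        for r, c in zip(reference, candidate)
        if math.isfinite(r) and math.isfinite(c)
    ]

    max_abs_diff = max((abs(r - c) for r, c in pairs), default=float("nan"))
    dot = sum(r * c for r, c in pairs)
    norm_ref = math.sqrt(sum(r * r for r, _ in pairs))
    norm_cand = math.sqrt(sum(c * c for _, c in pairs))
    if norm_ref == 0.0 or norm_cand == 0.0:
        cosine = float("nan")
    else:
        cosine = dot / (norm_ref * norm_cand)

    return {
        "non_finite": non_finite,
        "max_abs_diff": max_abs_diff,
        "cosine": cosine,
    }
```

Running this on the output of each half of the split encoder separately would show whether the mismatch already appears after the first half or only after the second.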

Because of the limited memory, I split the encoder model in half and export it as two ONNX files.

Environment

TensorRT Version: 8.5.2
GPU Type: Jetson Orin NX
CUDA Version: 11.4
CUDNN Version: 8.6.0
Operating System + Version: Jetson

Relevant Files

I use this to export the encoder model to ONNX.