Description
I am trying to integrate the large Segment Anything Model (SAM, ViT-H version) with TensorRT in C++. Although I successfully serialized the encoder, I am stuck on serializing the decoder.
I generated the ONNX model using the code provided by Meta in the Segment Anything Model repository. However, when creating/serializing the decoder, I get the following error:
ITensor::getDimensions: Error Code 4: API Usage Error (Network input orig_im_size is a shape tensor and must have type Int32 or Int64. Type is Float.)
To fix this, I changed the type of the orig_im_size tensor from float to int32 in the code:
dummy_inputs = { "orig_im_size": torch.tensor([1500, 2250], dtype=torch.int32) }
However, when using the newly generated ONNX file, I get another error:
kERROR: IBuilder::buildSerializedNetwork: Error Code 4: API Usage Error (Optimization profile 0 is missing values for shape input tensor orig_im_size.)
Even though it does not make much sense for a static input, I also tried setting dynamic dimensions on it, but the issue persisted.
What confuses me is that I can generate an engine successfully using trtexec without any issues. I haven't been able to test that engine yet because of problems loading the engine file, but my main goal is to understand what is wrong and learn how to build a plan manually the proper way.
I am a beginner with TensorRT and don't know anyone around me who has experience with it.
Thank you for your help.
Environment
TensorRT Version: 10.7.0.23
GPU Type: NVIDIA GeForce RTX 4060
Nvidia Driver Version: 561.17
CUDA Version: 12.6
CUDNN Version: 9.2
Operating System + Version: Windows 11 Pro: 10.0.26100
Python Version (if applicable): 3.11.9 (used to build encoder/decoder onnx files)
PyTorch Version (if applicable): 2.5.1+cu124 (used to build encoder/decoder onnx files)
Relevant Files
I have attached all the different outputs I got in a single file.
outputs_tensorRT.txt (35.6 KB)
Steps To Reproduce
From linked repository:
Step 0: install requirements.txt
Step 1: Download the SAM ViT-H checkpoint: GitHub - facebookresearch/segment-anything
Step 2: Place the file in models\pth
Step 3: Generate the encoder ONNX file with the export_encoder_model script: python export_encoder_model.py --quantize=False --device="cpu"
Step 4: Replace the paths to the bin and lib directories of TensorRT in CMakeLists.txt with your own
Step 5: Build and run the sam target from build/Debug, or else change the path in the main