Description
I converted the DINOv2 (dinov2_vits14, embeddings) PyTorch model to ONNX and then to TensorRT.
I extracted the embeddings from the following cropped images:
- drink image (regular position)
- drink image (rotated position)
- chips image (regular position)
- chips image (rotated position)
The distance between the embeddings of drink vs. drink rotated is ~0.4 (the same for chips vs. chips rotated).
The distance between the embeddings of drink vs. chips is ~0.75.
This makes sense, because I expect a higher distance between different products and a lower distance between two views of the same product.
The above test was run with the PyTorch and ONNX DINOv2 models.
But with the TensorRT model I get the following distances:
Distance between drink and chips is ~0.53
Distance between drink and drink rotated is ~0.42
These are too close. I expect a separation as clear as the one the PyTorch and ONNX models provide.
Any ideas? Maybe the TensorRT conversion was done incorrectly?
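For reference, here is a minimal sketch of how a distance like the ones above can be computed. It assumes cosine distance (1 minus cosine similarity); the metric is an assumption, since the numbers above do not depend on showing it, and the helper name cosine_distance is only illustrative.

```python
import numpy as np

def cosine_distance(a, b):
    """Return 1 - cosine similarity between two embedding vectors."""
    # Flatten so that outputs shaped (1, D) and (D,) are handled the same way
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - float(np.dot(a, b) / denom)

# Toy vectors standing in for real embeddings (placeholders, not real data)
emb_a = np.array([1.0, 0.0, 0.0])
emb_b = np.array([0.0, 1.0, 0.0])
print(cosine_distance(emb_a, emb_a))  # identical vectors
print(cosine_distance(emb_a, emb_b))  # orthogonal vectors
```

With this definition the value ranges from 0 (same direction) to 2 (opposite direction), so the same function has to be used for all three backends for the numbers to be comparable.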
Environment
TensorRT Version: 10.3.0
GPU Type: Nvidia
Nvidia Driver Version: 540.4.0
CUDA Version: 12.6
Operating System + Version: Jetson Orin Nano Developer Kit, JetPack 6.0
Python Version (if applicable): 3.10.12
Conversion record:
- PyTorch to ONNX conversion:
import torch

# Wrap the model to exclude the masks input during export
class DINOEmbeddingExtractor(torch.nn.Module):
    def __init__(self, model):
        super(DINOEmbeddingExtractor, self).__init__()
        self.model = model

    def forward(self, x):
        # Forward only the required input and ignore any mask inputs
        return self.model(x)

# Load your DINOv2 model
model = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14", source="github")
model.eval()

# Wrap the model
embedding_extractor = DINOEmbeddingExtractor(model)

# Define a dummy input for the ONNX export
dummy_input = torch.randn(1, 3, 224, 224)

# Export the wrapped model to ONNX without the mask input
torch.onnx.export(
    embedding_extractor,
    dummy_input,
    "dinov2_vit_s14_no_masks.onnx",
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch_size"}, "output": {0: "batch_size"}},
    opset_version=17,
)
print("Model successfully exported to dinov2_vit_s14_no_masks.onnx")
- ONNX to TensorRT conversion:
/usr/src/tensorrt/bin/trtexec --onnx=/home/shraga/workspace/projects/kanduai-express-checkout/inference_testing_and_convertions/embeddings/dinov2_vit_s14_no_masks.onnx --saveEngine=/home/shraga/workspace/projects/kanduai-express-checkout/inference_testing_and_convertions/embeddings/dinov2_vit_s14_no_masks_fp32.trt
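One way to narrow down whether the conversion itself changes the output is to feed the exact same preprocessed tensor to the ONNX model and to the TensorRT engine and compare the two raw output vectors directly. The sketch below covers only the comparison step. The helper name compare_outputs is hypothetical, and how the two arrays are obtained (for example from ONNX Runtime and from the TensorRT Python runtime) is left out and assumed.

```python
import numpy as np

def compare_outputs(ref, test):
    """Compare two model outputs produced from the same input tensor.

    Returns (max_abs_diff, cosine_similarity).
    """
    ref = np.asarray(ref, dtype=np.float64).ravel()
    test = np.asarray(test, dtype=np.float64).ravel()
    if ref.shape != test.shape:
        raise ValueError(f"shape mismatch: {ref.shape} vs {test.shape}")
    # Largest element-wise deviation between the two outputs
    max_abs_diff = float(np.max(np.abs(ref - test)))
    # Direction agreement, independent of overall scale
    cos_sim = float(np.dot(ref, test) / (np.linalg.norm(ref) * np.linalg.norm(test)))
    return max_abs_diff, cos_sim

# Placeholder arrays; in practice these would be the ONNX and TensorRT
# outputs for one identical input tensor (assumption, not shown here).
onnx_out = np.array([0.10, -0.20, 0.30])
trt_out = np.array([0.10, -0.20, 0.30])
print(compare_outputs(onnx_out, trt_out))
```

If the two outputs agree closely on identical input, the difference in distances would more likely come from something outside the engine, such as the preprocessing or how the output buffer is read, but that is only a guess until it is measured.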