Comparing Embedding Output Between PyTorch and TensorRT

Description

Questions:

  1. How should I compare the embedding output of a PyTorch model loaded from TorchScript with the embedding output of a TensorRT engine to evaluate my implementation?
  2. How close should I expect the embedding outputs to be?

I converted a PyTorch model loaded from TorchScript and serialized it into a TensorRT engine plan using the Python API with FP32 precision. I would now like to compare the outputs to make sure my implementation is correct. I understand that some difference in output is expected, based on the instructional video on converting TensorFlow models to TensorRT for classification. I ran the same image through both models, flattened each embedding to a 1D array, and compared them using cosine distance (1 - cosine similarity), which came out to 0.0013010502 (see the sketch below). Is this a good method for comparing the inferences between the two models to verify that I performed the conversion correctly and that my TensorRT engine output is what I should expect? How close should I expect the outputs to be?
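For reference, this is roughly how I am comparing the two embeddings. It is a minimal sketch; `pt_embedding` and `trt_embedding` are placeholder names for the host-side NumPy arrays I copy out of the TorchScript model and the TensorRT engine after running the same preprocessed image through both.

```python
import numpy as np
from scipy.spatial.distance import cosine


def compare_embeddings(pt_embedding: np.ndarray, trt_embedding: np.ndarray) -> None:
    """Compare two embedding tensors after flattening them to 1D float32 arrays."""
    pt_flat = np.asarray(pt_embedding, dtype=np.float32).ravel()
    trt_flat = np.asarray(trt_embedding, dtype=np.float32).ravel()

    # Cosine distance = 1 - cosine similarity between the flattened embeddings.
    cos_dist = cosine(pt_flat, trt_flat)

    # Element-wise absolute error statistics as an additional sanity check.
    abs_err = np.abs(pt_flat - trt_flat)

    print(f"cosine distance: {cos_dist:.10f}")
    print(f"max abs error:   {abs_err.max():.6e}")
    print(f"mean abs error:  {abs_err.mean():.6e}")
```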

Environment

TensorRT Version: 10.8.0.43
GPU Type: A5000
Nvidia Driver Version: 565.57.01
CUDA Version: 12.7
CUDNN Version: n/a
Operating System + Version: Ubuntu 22.04.5 LTS
Python Version (if applicable): Python 3.12.8
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 2.5.1+cu124
Baremetal or Container (if container which image + tag): none

Relevant Files

None, as this is not a bug report.

Steps To Reproduce

Please see the questions above.