Inference Discrepancy Between TensorRT 10.13.2 (Thor) and 8.6.2 (Orin)

Hi NVIDIA Team,

I’m observing inconsistent inference results for the same ONNX model on two platforms:

  • Thor (TensorRT 10.13.2)

  • Orin (TensorRT 8.6.2)

Key Details:

  1. The input tensor is identical across both platforms (shape, dtype, values).

  2. ONNX Runtime inference matches Orin’s output but differs from Thor’s.

  3. Model link for repro: https://drive.google.com/file/d/1rNg2gJrOMgkCLsQTXj9bbensLPP5XpSn/view?usp=sharing

Request:
Could you help investigate what could cause this discrepancy, and how I can systematically troubleshoot it?

Thank you for your expertise!

Best regards,

Hi,

Could you share more details about this issue?
Is the result generated on Thor correct?

TensorRT may select different kernels/tactics on Thor and Orin, so the outputs are not expected to be bit-identical between the two platforms.

Thanks.

Hi,

Thank you for your prompt reply.

I’d like to clarify that the output of this model generated on Thor is indeed incorrect. To ensure a fair comparison, I’ve verified that the inference code running on both Thor and Orin is identical. Additionally, the TensorRT engines used on both platforms were generated separately from the same ONNX model, using the exact same conversion command on each device.

To further isolate the issue, I saved the input tensor (just before model inference) as a binary file from both platforms. The inputs are nearly identical.

Moreover, I implemented a local ONNX-based inference script using the same ONNX model and fed it the saved input tensor. The result from this ONNX reference implementation matches the output from Orin and is functionally correct, whereas the output from Thor deviates significantly and is incorrect.
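Roughly, the validation flow can be sketched as follows. This is only a sketch: the file names, dtype, and shape below are placeholders (not my model's real values), and the actual ONNX Runtime call is omitted so the comparison logic stands on its own.

```python
# Hedged sketch of the cross-platform output comparison.
# File names, dtype, and shape are placeholders, not the real model's values.
import numpy as np

def load_tensor(path, dtype=np.float32, shape=(1, 1)):
    """Load a raw binary tensor dump (as saved with ndarray.tofile)."""
    return np.fromfile(path, dtype=dtype).reshape(shape)

def compare_outputs(ref, test, rtol=1e-5, atol=1e-5):
    """Report element-wise error between a reference and a test output."""
    abs_err = np.abs(ref - test)
    rel_err = abs_err / np.maximum(np.abs(ref), 1e-12)  # guard div-by-zero
    return {
        "max_abs": float(abs_err.max()),
        "max_rel": float(rel_err.max()),
        "allclose": bool(np.allclose(ref, test, rtol=rtol, atol=atol)),
    }

# Hypothetical usage with the saved dumps:
# onnx_out = load_tensor("onnxrt_scores.bin")
# thor_out = load_tensor("thor_scores.bin")
# print(compare_outputs(onnx_out, thor_out))
```

With this, Orin's dump matched the ONNX Runtime reference within tolerance, while Thor's did not.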

For your convenience, I’m happy to provide:

1. The ONNX model,

2. A representative input binary file (with matching dimensions, randomly initialized for data privacy),

3. And the local ONNX inference script I used for validation.

https://drive.google.com/file/d/16jWhq8y68iYhvKGtqAvI0YPNu8aL-p05/view?usp=sharing

These should allow you to reproduce the discrepancy on your end. If you need any additional materials or information to help reproduce or diagnose the issue, please don’t hesitate to let me know—I’d be glad to assist.
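In case it helps, the representative input binary was produced along these lines. This is a sketch only: the shape, seed, and file name are placeholders, not the model's real input dimensions.

```python
# Sketch of how the randomly initialized input binary was produced
# (real data replaced for privacy). Shape, seed, and file name are
# placeholders for illustration.
import numpy as np

def make_random_input(path, shape=(1, 3, 224, 224), seed=0):
    """Write a reproducible random float32 tensor as a raw binary dump."""
    rng = np.random.default_rng(seed)
    data = rng.standard_normal(shape).astype(np.float32)
    data.tofile(path)  # raw bytes only; dtype/shape must be known to reload
    return data

# Hypothetical usage:
# make_random_input("input.bin")
```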

Thank you again for your support—I look forward to your insights.

Best regards.

Hi,

Thanks for providing more details about this issue.

We tried to reproduce the accuracy drop with Polygraphy on the attached ONNX model (ubt_20251229.onnx).

However, we found that the output is NaN for both the ONNX Runtime and TensorRT backends.
Could you help us check why the output is not valid?

$ polygraphy run ubt_20251229.onnx --onnxrt --trt --verbose
...
[I] Accuracy Comparison | onnxrt-runner-N0-01/05/26-05:25:14 vs. trt-runner-N0-01/05/26-05:25:14
[I]     Comparing Output: 'scores' (dtype=float32, shape=(1, 1)) with 'scores' (dtype=float32, shape=(1, 1))
[I]         Tolerance: [abs=1e-05, rel=1e-05] | Checking elemwise error
[I]         onnxrt-runner-N0-01/05/26-05:25:14: scores | Stats: mean=nan, std-dev=nan, var=nan, median=nan, min=nan at (0, 0), max=nan at (0, 0), avg-magnitude=nan, p90=nan, p95=nan, p99=nan
[I]             ---- Values ----
                    [[nan]]
[V]             Could not generate histogram. Note: Error was: supplied range of [nan, nan] is not finite
[I]             
[I]         trt-runner-N0-01/05/26-05:25:14: scores | Stats: mean=nan, std-dev=nan, var=nan, median=nan, min=nan at (0, 0), max=nan at (0, 0), avg-magnitude=nan, p90=nan, p95=nan, p99=nan
[I]             ---- Values ----
                    [[nan]]
[V]             Could not generate histogram. Note: Error was: supplied range of [nan, nan] is not finite
[I]             
[I]         Error Metrics: scores
[I]             Minimum Required Tolerance: elemwise error | [abs=nan] OR [rel=nan] (requirements may be lower if both abs/rel tolerances are set)
[I]             Absolute Difference | Stats: mean=nan, std-dev=nan, var=nan, median=nan, min=nan at (0, 0), max=nan at (0, 0), avg-magnitude=nan, p90=nan, p95=nan, p99=nan
[I]                 ---- Values ----
                        [[nan]]
[V]                 Could not generate histogram. Note: Error was: autodetected range of [nan, nan] is not finite
[I]                 
[I]             Relative Difference | Stats: mean=nan, std-dev=nan, var=nan, median=nan, min=nan at (0, 0), max=nan at (0, 0), avg-magnitude=nan, p90=nan, p95=nan, p99=nan
[I]                 ---- Values ----
                        [[nan]]
[V]                 Could not generate histogram. Note: Error was: autodetected range of [nan, nan] is not finite
[I]                 
[E]         FAILED | Output: 'scores' | Difference exceeds tolerance (rel=1e-05, abs=1e-05)
[E]     FAILED | Mismatched outputs: ['scores']
[E] Accuracy Summary | onnxrt-runner-N0-01/05/26-05:25:14 vs. trt-runner-N0-01/05/26-05:25:14 | Passed: 0/1 iterations | Pass Rate: 0.0%
[E] FAILED | Runtime: 12.508s | Command: /home/nvidia/topic_356086/env/bin/polygraphy run ubt_20251229.onnx --onnxrt --trt --verbose

This tool is what our internal team uses to check such issues.
It can be installed via the following command:

$ pip3 install polygraphy
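As a quick first check on the NaN outputs, it may also be worth verifying that the input dump itself is finite before it reaches either backend. This is a sketch; the path, dtype, and shape are placeholders.

```python
# Sketch: verify a raw input dump contains no NaN/Inf before inference.
# Path and dtype are placeholders.
import numpy as np

def check_finite(path, dtype=np.float32):
    """Return (num_nan, num_inf) found in a raw binary tensor dump."""
    data = np.fromfile(path, dtype=dtype)
    return int(np.isnan(data).sum()), int(np.isinf(data).sum())

# Hypothetical usage:
# nans, infs = check_finite("input.bin")
# print(f"NaNs: {nans}, Infs: {infs}")
```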

Thanks.

Hi,

With the materials attached on Jan 4, we can reproduce this issue internally.
We are checking it with our internal team and will provide more information later.

Thanks.

Hi,

Thank you for confirming the issue and providing an update. We appreciate your team’s efforts in reproducing and investigating the problem.

Please feel free to reach out if any additional information or collaboration is needed from our side. We look forward to your further insights and resolution steps.

Thanks.

Hi,

Thanks a lot for your patience.

Our internal team needs more time for this issue.
We will keep you updated on any progress.

Thanks.

Hi,

Thank you for the update — I really appreciate your team’s continued attention to this matter.

I understand that these things can take time, and I truly value the effort you’re putting in. Please do keep me posted as things progress. I’m looking forward to hearing more updates from you soon.

Thanks.

Hi,

I hope you’re doing well. I’d like to check if there’s any update on this issue. Could you also confirm whether your team was able to reproduce the problem—specifically, that the engine’s inference results indeed don’t align with those from ONNX?

Thank you!