ONNX/TensorRT INT64 Clamping. Why?

Models exist which use INT64 values.
Inference with those models works on my 4090 GPU.
Hardware computation on a GPU or CPU involving INT64 values doesn't care whether the code invoking it is non-compiled Python or highly optimized TRT-compiled code.
So why doesn't TRT support INT64?

Also, if there were a good reason, wouldn't a conversion to float32 better represent large-magnitude INT64 values than clamping to INT32? I'm not sure if this might be a contributing factor in the poor quality of SD-inferenced images when TRT is used.
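The trade-off can be sketched in plain Python: clamping a large INT64 value to the INT32 range saturates it and loses all magnitude information, while a float32 conversion keeps the magnitude with only a small relative error. This is just an illustrative sketch of the numeric behavior; the value and helper names are made up for the example.

```python
import struct

INT32_MAX = 2**31 - 1

def to_float32(x):
    """Round-trip a number through IEEE-754 single precision."""
    return struct.unpack('<f', struct.pack('<f', float(x)))[0]

def clamp_int32(x):
    """Saturate a value to the INT32 range (what INT64 clamping does)."""
    return max(-2**31, min(INT32_MAX, x))

value = 2**40 + 12345            # an INT64 value far outside INT32 range

clamped = clamp_int32(value)     # 2147483647: magnitude information is gone
as_f32 = to_float32(value)       # 1099511627776.0: tiny relative error

print(clamped, as_f32)
```

Whether that relative error matters depends on how the tensor is used (e.g. as an index vs. as an arithmetic operand), which is presumably why a blanket float32 conversion isn't done either.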

Could you please share the ONNX model and the script, if not shared already, so that we can assist you better.
Alongside, you can try a few things:

  1. Validate your model with the below snippet:


import onnx
filename = yourONNXmodel
model = onnx.load(filename)
onnx.checker.check_model(model)
  2. Try running your model with the trtexec command.

In case you are still facing the issue, please share the trtexec "--verbose" log for further debugging.
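For step 2, a minimal trtexec invocation might look like the following (the model and log file names are placeholders; trtexec ships in the TensorRT bin directory):

```shell
# Attempt to build a TensorRT engine from the ONNX model,
# capturing the verbose log for debugging.
trtexec --onnx=model.onnx --verbose > trtexec_verbose.log 2>&1
```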

It is well known that TensorRT doesn't support INT64. Are you doubting that non-TensorRT models sometimes use INT64? Do you really want me to post a 1.7 GB unet_fp16.onnx model converted from Hugging Face runwayml/stable-diffusion-v1-5?

I'm not trying to track down a bug for which providing a test case would be appropriate. This situation is obvious: TRT does NOT support INT64. Why? My NVIDIA hardware has no problem doing inference with non-TensorRT engine models that happen to contain INT64 values.

Also, onnx-checker didn't output anything, as if there were no problem with the model.


INT64 support will be added in a future major release version. Please stay tuned for the update.

Thank you.