TensorRT inference TopK layer output isn't the same as Torch (training framework) inference TopK layer output (indices and values)

Description

I have a conceptual question about layers like TopK, MaxPool, etc.
TopK is a layer that selects the K largest (or smallest) elements based on their values
and returns both the values and their indices.

The TensorRT engine, which is built during the optimization process using tactic selection, will feed these layers input tensors whose values differ from the input values seen by Torch, which in turn generates different output tensors (indices and values).

These kinds of differences in intermediate layers ultimately cause the final model output to differ.

This holds even when the gap between the TopK input values (TensorRT vs. Torch) is extremely small, e.g. on the order of 1e-10.
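To illustrate the point (a minimal, hypothetical sketch in plain PyTorch, not my actual model): when two candidate values are nearly tied, a perturbation on the order of 1e-10 is enough to flip which index TopK selects.

```python
import torch

# Two candidates that differ by less than the injected noise (float64 so
# that differences at the 1e-10 scale are representable near 0.5).
x = torch.tensor([0.5, 0.5 + 1e-12, 0.1], dtype=torch.float64)

# Reference run ("Torch" side): top-1 picks index 1.
_, idx_ref = torch.topk(x, k=1)

# Perturbed run (standing in for "TensorRT" after different tactic
# selection): an error of only 1e-10 on element 0 flips the winner.
x_alt = x.clone()
x_alt[0] += 1e-10
_, idx_alt = torch.topk(x_alt, k=1)

print(idx_ref.item(), idx_alt.item())  # 1 0
```

The values still agree to within 1e-10, but the indices disagree outright, which is why the downstream outputs diverge.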

How can I control this gap in the TopK input tensors so that the intermediate outputs match and the final model output is correct?

Environment

TensorRT Version: 8.6.1.6
GPU Type: RTX 4090 mobile
Nvidia Driver Version: 546.24
CUDA Version: 12.3, V12.3.107
CUDNN Version: 8.9.7
Operating System + Version: Ubuntu 22.04.3 LTS (GNU/Linux 5.15.133.1-microsoft-standard-WSL2 x86_64)
Python Version (if applicable): 3.10.12
TensorFlow Version (if applicable): NA
PyTorch Version (if applicable): 2.2.1+cu121
Baremetal or Container (if container which image + tag): Container - nvcr.io/nvidia/tensorrt:24.01-py3


Any support will be much appreciated.
Thanks

Hi @orong13 ,
Is it just one single TopK layer? If so, I would expect bit-wise matching output; if not, it should be considered a TRT bug.
Please let us know additional details.

Thanks

Hello @AakankshaS

Thanks for your answer.

I will try to clarify the issue.

When the input to TopK is bit-wise matched, the TopK operation generates identical, bit-wise matched output for both TensorRT and Torch.
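As a quick sanity check of this claim (a generic, Torch-only sketch, not tied to the attached model): for a fixed, bit-identical input, torch.topk itself is deterministic and reproduces exactly.

```python
import torch

torch.manual_seed(0)
x = torch.randn(1000)

# Running TopK twice on bit-identical input yields bit-identical results.
v1, i1 = torch.topk(x, k=5)
v2, i2 = torch.topk(x.clone(), k=5)

print(torch.equal(v1, v2), torch.equal(i1, i2))  # True True
```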

This can be demonstrated using a sample model like the attached one - TopkModule.
TopkModule.zip (824 Bytes)

The question is what the output will be when the input isn't bit-wise matched.
Even if the input differs by a very small gap, such as 1e-10, the TopK output will not be bit-wise matched.

In the second attached sample model - ConvNormTopkModule.
ConvNormTopkModule.zip (1.3 KB)

you will find that the TopK values do not match between TensorRT and Torch, but the index outputs still match.

I believe that in the second case the output indices are equal only because it is a very simple model.
For my real model this isn't the case, because it includes many intermediate layers before the TopK layer, and these produce unequal outputs between TensorRT and Torch.
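For comparing such results, I use a lenient check (a hypothetical helper of my own, not part of either framework): treat two TopK outputs as matched when the sorted values agree within a tolerance and the selected indices coincide as sets, so near-tied elements that merely swap positions are not flagged as errors.

```python
import torch

def topk_agreement(vals_a, idx_a, vals_b, idx_b, atol=1e-5):
    """Leniently compare two TopK results: sorted values must agree
    within atol, and the selected indices must match as *sets*, so that
    near-tied elements that swap positions are not counted as errors."""
    vals_ok = torch.allclose(torch.sort(vals_a).values,
                             torch.sort(vals_b).values, atol=atol)
    idx_ok = set(idx_a.tolist()) == set(idx_b.tolist())
    return vals_ok, idx_ok

# Example: two runs pick the same elements in a different order.
a_vals = torch.tensor([0.9, 0.8999999])
a_idx = torch.tensor([3, 7])
b_vals = torch.tensor([0.8999999, 0.9])
b_idx = torch.tensor([7, 3])
print(topk_agreement(a_vals, a_idx, b_vals, b_idx))  # (True, True)
```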

Please correct me if I am wrong.

Regards,