TensorRT 8.6 vs TensorRT 10.3 Accuracy drop issue on FaceXFormer

Description

I’m converting an ONNX model to a TensorRT engine in two different environments, using the FaceXFormer model (GitHub - Kartik-3004/facexformer: [ ICCV 2025 ] FaceXFormer: A Unified Transformer for Facial Analysis).

Environment 1: Ubuntu 20.04 with TensorRT 8.6.0
Environment 2: Ubuntu 22.04 with TensorRT 10.3.0

In environment 1, the ONNX model and the TensorRT engine both work well and yield nearly identical results.
In environment 2, however, the ONNX model and the TensorRT engine output completely different results.
I found that the problem comes from the final FC layer (in the prediction heads of the face decoder head), rather than from the FP32 setting or other build options.
But I still do not understand why the output of the FC layer differs even though I used the same ONNX model.

Has anyone encountered a similar issue, or could anyone provide guidance on how to handle this accuracy drop in TensorRT 10.3?

Environment

TensorRT Version: 10.3.0.30
NVIDIA GPU: NVIDIA Jetson Orin AGX (64GB ram), aarch64
Nvidia Driver Version: Jetpack 6.1
CUDA Version: 12.6.68
CUDNN Version: 9.3.0.75
Operating System + Version: Ubuntu 22.04
Python Version (if applicable): 3.10.12

Relevant Files

FaceXFormer github: GitHub - Kartik-3004/facexformer: [ ICCV 2025 ] FaceXFormer: A Unified Transformer for Facial Analysis

facexformer/network/models/facexformer.py

import torch.nn as nn
import torch.nn.functional as F


class MLP(nn.Module):
    def __init__(
        self,
        input_dim: int,
        hidden_dim: int,
        output_dim: int,
        num_layers: int,
        sigmoid_output: bool = False,
    ) -> None:
        super().__init__()
        self.num_layers = num_layers
        h = [hidden_dim] * (num_layers - 1)
        self.layers = nn.ModuleList(
            nn.Linear(n, k) for n, k in zip([input_dim] + h, h + [output_dim])
        )  # NOTE: the output of self.layers[2] is incorrect here.
        self.sigmoid_output = sigmoid_output

    def forward(self, x):
        for i, layer in enumerate(self.layers):
            x = F.relu(layer(x)) if i < self.num_layers - 1 else layer(x)
        if self.sigmoid_output:
            x = F.sigmoid(x)
        return x
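
Because the last layer of this MLP is a plain `nn.Linear` with no activation after it, its expected output can be recomputed outside both runtimes from the preceding activations and the layer's weights. The following is only a minimal sketch of that idea in plain Python; `linear_reference` and the toy values are hypothetical names and data for illustration, and real inputs would have to be exported from the checkpoint and from the `self.layers[1]` output.

```python
from typing import List, Sequence


def linear_reference(
    x: Sequence[float],
    weight: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> List[float]:
    """Recompute y = W @ x + b for a single sample.

    `weight` is laid out as (out_features, in_features), the same layout
    as nn.Linear.weight.
    """
    if len(weight) != len(bias):
        raise ValueError("weight and bias disagree on out_features")
    if any(len(row) != len(x) for row in weight):
        raise ValueError("every weight row must have len(x) entries")
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


# Toy values for illustration only (assumption): real inputs would be the
# activations from self.layers[1] and the exported weights of self.layers[2].
toy_x = [1.0, 2.0]
toy_weight = [[1.0, 0.0], [0.5, 0.5]]
toy_bias = [0.0, 1.0]
toy_y = linear_reference(toy_x, toy_weight, toy_bias)
print(toy_y)
```

Whichever backend agrees with such a reference on the same input is the one computing the layer correctly, which narrows the problem to the other one.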

Below are the gender outputs of self.layers[1] and self.layers[2] of the MLP module above, from the ONNX model and the TensorRT engine respectively (TensorRT version: 10.3.0).

gender output of the self.layers[1]

  • array([[0.        , 0.        , 0.05808761, 0.        , 0.        ,
            0.        , 3.477199  , 0.        , 0.        , 0.        ,
            0.27408314, 0.        , 2.2271016 , 2.3485596 , 0.        ,
            0.        , 0.        , 1.8179287 , 1.5138923 , 0.        ,
            1.3196965 , 1.0186559 , 0.        , 0.        , 0.        ,
            0.        , 1.5651121 , 0.03827067, 0.        , 0.59134567,
            0.        , 2.1294482 , 1.86005   , 2.2994459 , 0.        ,
            0.        , 0.25710738, 0.        , 0.        , 0.        ,
            0.        , 0.84698665, 0.21039551, 0.        , 0.        ,
            0.        , 0.        , 0.        , 0.45060235, 0.        ,
            2.8804781 , 0.        , 0.        , 0.        , 0.        ,
            0.48541653, 0.        , 0.        , 0.        , 0.        ,
            0.27044433, 0.        , 0.30780125, 0.        , 1.4533952 ,
            0.        , 0.        , 0.        , 0.        , 0.        ,
            0.        , 0.        , 0.        , 0.        , 0.72238696,
            0.64325356, 0.        , 0.        , 0.        , 0.        ,
            0.        , 1.2486753 , 0.23857433, 0.        , 0.        ,
            0.        , 0.        , 0.        , 0.        , 0.        ,
            1.8387429 , 1.3987124 , 0.        , 0.        , 0.        ,
            0.        , 0.        , 0.47385132, 0.67286277, 0.        ,
            0.        , 0.        , 0.        , 0.6301911 , 0.        ,
            0.        , 1.0899894 , 0.6948872 , 2.4827533 , 2.4057944 ,
            0.        , 0.        , 0.        , 0.        , 0.        ,
            1.4392943 , 2.8943033 , 0.        , 0.        , 1.707997  ,
            2.3642976 , 0.        , 1.8984444 , 0.31208912, 0.        ,
            0.        , 1.206218  , 0.952178  , 0.52337813, 0.9476676 ,
            0.        , 0.        , 3.434001  , 0.        , 0.8404969 ,
            2.947358  , 0.7576987 , 0.        , 0.        , 0.        ,
            0.        , 0.        , 0.        , 2.3338456 , 1.1715947 ,
            0.        , 0.27870122, 0.        , 0.        , 0.        ,
            1.3402828 , 0.        , 0.        , 0.98650426, 0.        ,
            0.37729126, 0.        , 0.        , 0.        , 0.        ,
            2.497146  , 0.40936956, 1.3655293 , 0.        , 0.        ,
            1.4468944 , 2.6951365 , 1.1120243 , 0.        , 0.310199  ,
            1.6930225 , 0.07920389, 0.        , 3.4387784 , 0.        ,
            3.312109  , 0.        , 0.        , 3.8581517 , 0.        ,
            0.        , 1.8305678 , 0.59664637, 1.2491843 , 1.2303423 ,
            1.1017985 , 0.        , 3.035324  , 0.        , 0.7382026 ,
            0.        , 0.        , 3.8139768 , 0.        , 0.        ,
            1.2108988 , 0.        , 0.7563885 , 0.        , 0.        ,
            0.        , 1.1610062 , 0.12705252, 0.        , 1.1377006 ,
            0.        , 1.0954049 , 0.        , 0.        , 1.6825731 ,
            0.        , 0.        , 0.        , 0.        , 0.        ,
            0.46103236, 0.        , 1.0125588 , 0.88917285, 0.        ,
            0.75880516, 0.        , 0.3064087 , 0.        , 0.        ,
            0.        , 0.87988764, 1.692071  , 0.        , 1.2867929 ,
            0.        , 1.5831435 , 0.        , 0.        , 0.        ,
            1.3537407 , 0.5190538 , 0.        , 0.        , 0.        ,
            0.41977894, 1.2256019 , 0.        , 0.9800967 , 0.        ,
            3.5048394 , 0.        , 0.        , 1.323862  , 0.        ,
            1.0765    , 0.        , 0.        , 1.7168648 , 0.        ,
            0.        ]], dtype=float32)  # when run by onnx model
    
    
  • array([0.        , 0.        , 0.05793402, 0.        , 0.        ,
           0.        , 3.4770603 , 0.        , 0.        , 0.        ,
           0.27417478, 0.        , 2.227072  , 2.3486083 , 0.        ,
           0.        , 0.        , 1.8178328 , 1.513855  , 0.        ,
           1.3198341 , 1.018712  , 0.        , 0.        , 0.        ,
           0.        , 1.5651159 , 0.0386155 , 0.        , 0.59137297,
           0.        , 2.1294098 , 1.8600523 , 2.2993505 , 0.        ,
           0.        , 0.25721002, 0.        , 0.        , 0.        ,
           0.        , 0.8471938 , 0.21050397, 0.        , 0.        ,
           0.        , 0.        , 0.        , 0.45072427, 0.        ,
           2.8805625 , 0.        , 0.        , 0.        , 0.        ,
           0.48538512, 0.        , 0.        , 0.        , 0.        ,
           0.27047098, 0.        , 0.30784667, 0.        , 1.4533136 ,
           0.        , 0.        , 0.        , 0.        , 0.        ,
           0.        , 0.        , 0.        , 0.        , 0.7225782 ,
           0.64328545, 0.        , 0.        , 0.        , 0.        ,
           0.        , 1.2486362 , 0.2387146 , 0.        , 0.        ,
           0.        , 0.        , 0.        , 0.        , 0.        ,
           1.8386166 , 1.3987378 , 0.        , 0.        , 0.        ,
           0.        , 0.        , 0.4737481 , 0.67282987, 0.        ,
           0.        , 0.        , 0.        , 0.6303336 , 0.        ,
           0.        , 1.0900357 , 0.69507843, 2.4827523 , 2.4058163 ,
           0.        , 0.        , 0.        , 0.        , 0.        ,
           1.4393821 , 2.8943098 , 0.        , 0.        , 1.7081137 ,
           2.3643477 , 0.        , 1.8984545 , 0.31208682, 0.        ,
           0.        , 1.2062762 , 0.9523665 , 0.5234538 , 0.94775575,
           0.        , 0.        , 3.4340189 , 0.        , 0.8404217 ,
           2.9474654 , 0.7575724 , 0.        , 0.        , 0.        ,
           0.        , 0.        , 0.        , 2.3339655 , 1.1714681 ,
           0.        , 0.278761  , 0.        , 0.        , 0.        ,
           1.3402406 , 0.        , 0.        , 0.9866192 , 0.        ,
           0.37722817, 0.        , 0.        , 0.        , 0.        ,
           2.4971704 , 0.40931424, 1.3654728 , 0.        , 0.        ,
           1.4469875 , 2.6954005 , 1.1120404 , 0.        , 0.31007853,
           1.693     , 0.07938661, 0.        , 3.4386125 , 0.        ,
           3.3119926 , 0.        , 0.        , 3.8579962 , 0.        ,
           0.        , 1.830519  , 0.5966802 , 1.2493219 , 1.230397  ,
           1.1018955 , 0.        , 3.0351832 , 0.        , 0.73843324,
           0.        , 0.        , 3.8139956 , 0.        , 0.        ,
           1.2108365 , 0.        , 0.75647795, 0.        , 0.        ,
           0.        , 1.1611803 , 0.1270166 , 0.        , 1.1378586 ,
           0.        , 1.0955011 , 0.        , 0.        , 1.6825553 ,
           0.        , 0.        , 0.        , 0.        , 0.        ,
           0.46100935, 0.        , 1.012655  , 0.88912433, 0.        ,
           0.75869244, 0.        , 0.3065601 , 0.        , 0.        ,
           0.        , 0.87993157, 1.6919793 , 0.        , 1.2867244 ,
           0.        , 1.5831835 , 0.        , 0.        , 0.        ,
           1.353816  , 0.5193663 , 0.        , 0.        , 0.        ,
           0.41982925, 1.2256112 , 0.        , 0.9800905 , 0.        ,
           3.5046687 , 0.        , 0.        , 1.3237199 , 0.        ,
           1.0765522 , 0.        , 0.        , 1.7168034 , 0.        ,
           0.        ], dtype=float32)  # when run by engine model
    

gender output of the self.layers[2]

  • array([[-4.4207296,  4.519924 ]], dtype=float32)  # when run by onnx model
    
  • array([ 0.94579434, -0.46932346], dtype=float32)  # when run by engine model
    

As you can see, the outputs of self.layers[1] from the ONNX model and the engine are nearly the same, but the outputs of self.layers[2] are not.
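
To put numbers on that comparison, here is a small sketch in plain Python that uses the values printed above. The helper names `max_abs_diff` and `argmax` are hypothetical, and only a few of the nonzero self.layers[1] entries are copied in to keep it short.

```python
from typing import Sequence


def max_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return max(abs(x - y) for x, y in zip(a, b))


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value, i.e. the predicted class for a logit vector."""
    return max(range(len(values)), key=lambda i: values[i])


# A few nonzero entries of the self.layers[1] output (indices 2, 6 and 10).
onnx_hidden = [0.05808761, 3.477199, 0.27408314]
trt_hidden = [0.05793402, 3.4770603, 0.27417478]

# The full self.layers[2] output (gender logits).
onnx_logits = [-4.4207296, 4.519924]
trt_logits = [0.94579434, -0.46932346]

hidden_diff = max_abs_diff(onnx_hidden, trt_hidden)
logit_diff = max_abs_diff(onnx_logits, trt_logits)
same_class = argmax(onnx_logits) == argmax(trt_logits)

print(f"layers[1] max abs diff (sampled): {hidden_diff:.6f}")
print(f"layers[2] max abs diff: {logit_diff:.6f}")
print(f"same predicted class: {same_class}")
```

The sampled hidden activations differ only in the fourth decimal place, while the logits differ by several units and the predicted class flips.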

The issue you’re experiencing with the FaceXFormer model is likely due to differences in the behavior of the Fully Connected (FC) layer between TensorRT 8.6 and 10.3.

Here are a few potential reasons for the discrepancy:

  1. **Layer fusion**: TensorRT 10.3 has improved layer fusion, which might affect the behavior of the FC layer. You can try disabling layer fusion using the `setFlag` method with the `kDISABLE_LAYER_FUSION` flag to see if it makes a difference.
  2. **Kernel selection**: TensorRT 10.3 might be selecting a different kernel for the FC layer, which could lead to differences in the output. You can try setting the `kSTRICT_TYPES` flag using the `setFlag` method to enforce strict type checking and see if it resolves the issue.
  3. **Numerical precision**: Although you mentioned that the issue is not related to FP32 or other options, it’s still possible that numerical precision differences between the two environments are causing the discrepancy. You can try setting the `kFP16` flag using the `setFlag` method to force the FC layer to use FP16 precision and see if it improves the accuracy.
  4. **Weight quantization**: If the ONNX model uses weight quantization, it’s possible that the quantization scheme is not being applied correctly in TensorRT 10.3. You can try setting the `kQUANTIZE_WEIGHTS` flag using the `setFlag` method to enable weight quantization and see if it resolves the issue.

To troubleshoot the issue, you can try the following steps:

  1. **Compare the layer output**: Use the `getOutput` method to retrieve the output of the FC layer in both environments and compare the values to identify any differences.
  2. **Check the layer configuration**: Use the `getLayer` method to retrieve the FC layer configuration in both environments and compare the settings to ensure they are identical.
  3. **Enable debug logging**: Set the `kDEBUG_LOG` flag using the `setFlag` method to enable debug logging and see if it provides any insights into the issue.

Here’s an example code snippet that demonstrates how to set the flags mentioned above:
```python
import tensorrt as trt

# Create a TensorRT builder
builder = trt.Builder(trt.Logger())

# Set the flags
builder.set_flag(trt.BuilderFlag.kDISABLE_LAYER_FUSION)
builder.set_flag(trt.BuilderFlag.kSTRICT_TYPES)
builder.set_flag(trt.BuilderFlag.kFP16)
builder.set_flag(trt.BuilderFlag.kQUANTIZE_WEIGHTS)

# Create a TensorRT engine
engine = builder.build_engine(network)
```
If none of these suggestions resolve the issue, please provide more details about your model, environment, and code, and I’ll do my best to help you troubleshoot the problem.

Thank you so much for your reply.

I tried all of the flags you suggested to check whether they affect the behavior of the FC layer, but ran into some issues.

Layer fusion

  • config.set_flag(trt.BuilderFlag.kDISABLE_LAYER_FUSION)
    • → AttributeError: type object ‘tensorrt.tensorrt.BuilderFlag’ has no attribute ‘kDISABLE_LAYER_FUSION’
  • changed ‘kDISABLE_LAYER_FUSION’ to ‘DISABLE_LAYER_FUSION’, but got the same error.

Kernel selection

  • config.set_flag(trt.BuilderFlag.kSTRICT_TYPES)
    • → AttributeError: type object ‘tensorrt.tensorrt.BuilderFlag’ has no attribute ‘kSTRICT_TYPES’
  • changed ‘kSTRICT_TYPES’ to ‘STRICT_TYPES’, but got the same error.

Numerical precision

  • config.set_flag(trt.BuilderFlag.kFP16)
    • → AttributeError: type object ‘tensorrt.tensorrt.BuilderFlag’ has no attribute ‘kFP16’. Did you mean: ‘FP16’?
  • changed ‘kFP16’ to ‘FP16’, and got the gender result below, which is nearly the same as the previous result.
    • array([0.94433594, -0.46850586], dtype=float32)

Weight quantization

  • config.set_flag(trt.BuilderFlag.kQUANTIZE_WEIGHTS)
    • → AttributeError: type object ‘tensorrt.tensorrt.BuilderFlag’ has no attribute ‘kQUANTIZE_WEIGHTS’
  • changed ‘kQUANTIZE_WEIGHTS’ to ‘QUANTIZE_WEIGHTS’, but got the same error.

Also, the builderflags you mentioned do not exist in the official document (IBuilderConfig — NVIDIA TensorRT Standard Python API Documentation 10.3.0 documentation).
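
For reference, this is roughly how the names can be checked against the enum before building, instead of hitting an AttributeError one flag at a time. It is only a sketch: `resolve_flags` and `DemoFlag` are hypothetical names, and `DemoFlag` merely stands in for `trt.BuilderFlag` so the snippet runs without TensorRT installed.

```python
import enum
from typing import Any, List, Sequence, Tuple


def resolve_flags(flag_enum: Any, names: Sequence[str]) -> Tuple[List[Any], List[str]]:
    """Split `names` into enum members that exist and names that do not."""
    found: List[Any] = []
    missing: List[str] = []
    for name in names:
        if hasattr(flag_enum, name):
            found.append(getattr(flag_enum, name))
        else:
            missing.append(name)
    return found, missing


class DemoFlag(enum.Enum):
    """Stand-in for trt.BuilderFlag (assumption, for illustration only)."""

    FP16 = 0


requested = ["kDISABLE_LAYER_FUSION", "kSTRICT_TYPES", "kFP16", "FP16"]
found, missing = resolve_flags(DemoFlag, requested)
print("available:", [flag.name for flag in found])
print("not in enum:", missing)

# With TensorRT installed, the same helper would be used roughly as:
#   found, missing = resolve_flags(trt.BuilderFlag, requested)
#   for flag in found:
#       config.set_flag(flag)
```

In my environment only the FP16 name from the list resolved, which matches the errors above.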

I would really appreciate it if you could help me troubleshoot this issue.