Erroneous mismatch in TensorRT serializations on JetPack 6 (L4T 36.4.0)

We have been running inference on models serialized with TensorRT on JetPack 5, using a container on a Seeed Studio Jetson Orin NX dev kit to run the TensorRT quantization/serialization process, and then deploying the model inside a container on a Syslogic A4NX (also an Orin NX). However, when we replicate this procedure with JetPack 6 on both the builder and the runner, we get the following error when we attempt to actually run inference on the runner:

[12/03/2025-14:16:36] [TRT] [E] IRuntime::deserializeCudaEngine: Error Code 1: Serialization (Serialization assertion safeVersionRead == kSAFE_SERIALIZATION_VERSION failed.Version tag does not match. Note: Current Version: 0, Serialized Engine Version: 239)
ERROR ❌ TensorRT model exported with a different version than 10.3.0

Here are the versions of libraries on the Builder:
Jetpack : 6
Base Image: nvcr.io/nvidia/l4t-cuda:12.6.11-runtime
Additional installed libraries in the container: libnvinfer10 libnvonnxparsers10 libnvinfer-plugin10 libcudnn9-cuda-12

Relevant Python libraries: Python 3.12, TensorRT == 10.3.0

Here are the versions on the machine where the model is inferred:
Jetpack: 6
Base Image: dustynv/l4t-pytorch:r36.4.0
Relevant Python libraries: Python 3.12.8 , TensorRT == 10.3.0

Every reference to this error we can find online says it indicates a TensorRT version mismatch. But we are using the same package on both machines, as seen above.
Our thoughts:
The serialized engine version shows 239, which we take to mean it targets the correct GPU (Orin), but the current version shows 0, so maybe some libraries are not being picked up properly.

We would really appreciate it if you have any ideas on why this is showing up as it does. If you need any further information, we are happy to provide it.
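For clarity, here is the rule we believe is at play, sketched in Python (the helper names are our own, not a TensorRT API):

```python
# Sketch of the compatibility rule behind the error: a serialized TensorRT
# plan is only guaranteed to deserialize against the exact libnvinfer
# version that built it. Helper names below are ours, not part of TensorRT.

def version_tuple(version: str) -> tuple:
    """Turn a version string like '10.3.0' into (10, 3, 0)."""
    return tuple(int(part) for part in version.split(".")[:3])

def engine_loadable(build_version: str, runtime_version: str) -> bool:
    """A plan built by one TensorRT version generally only loads with
    the exact same libnvinfer version at runtime."""
    return version_tuple(build_version) == version_tuple(runtime_version)

print(engine_loadable("10.3.0", "10.3.0"))  # True
print(engine_loadable("10.3.0", "10.4.0"))  # False
```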

*** Please note that this reply is generated by LLM automatically ***

The error you’re encountering is likely due to a version mismatch between the TensorRT versions used for serialization and inference. Although you’re using the same package on both machines, there might be an underlying library or dependency issue.

Here are a few potential reasons for this error:

  1. TensorRT version mismatch: Although you’re using the same TensorRT package (10.3.0), there might be some internal version mismatches. The serialized engine reports version tag 239 while the runtime reads 0, which points to the plan being written and read by different libnvinfer builds.
  2. Library or dependency issues: The error might be caused by a mismatch in the libraries or dependencies used during serialization and inference. For example, the libnvinfer10, libnvonnxparsers10, and libnvinfer-plugin10 libraries are installed in the builder container, but it’s unclear if they’re properly linked or loaded during inference.
  3. GPU architecture mismatch: Although you’re using the same Orin NX GPU in both machines, there might be some differences in the GPU architecture or configuration that could cause the error.

To troubleshoot this issue, you can try the following:

  1. Verify library versions: Double-check the versions of the libraries installed in both containers, including libnvinfer10, libnvonnxparsers10, and libnvinfer-plugin10. Ensure that they match the versions required by TensorRT 10.3.0.
  2. Check GPU architecture: Verify that the GPU architecture and configuration are identical in both machines. You can use tools like nvidia-smi to check the GPU information.
  3. Use the same base image: Try using the same base image (e.g., nvcr.io/nvidia/l4t-cuda:12.6.11-runtime) in both containers to ensure that the underlying libraries and dependencies are identical.
  4. Rebuild the serialized engine: Rebuild the serialized engine using the same TensorRT version and libraries as the inference container. This will ensure that the engine is compatible with the inference environment.
  5. Check for any updates: Ensure that you’re using the latest versions of TensorRT, CUDA, and other dependencies. Sometimes, updating to the latest versions can resolve compatibility issues.
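As a concrete starting point for step 1, here is a small stdlib-only Python script (the helper name and report keys are our own invention, not an NVIDIA tool) that you can run in each container and diff; the tensorrt import and the dpkg query are guarded so it still runs where either is missing:

```python
# Sketch of an environment report to run in both containers and compare.
# Standard library only; the tensorrt import and the dpkg-query call are
# guarded so the script works even where they are absent.
import platform
import shutil
import subprocess

def collect_report() -> dict:
    report = {"python": platform.python_version()}
    try:
        import tensorrt
        report["tensorrt_wheel"] = tensorrt.__version__
    except ImportError:
        report["tensorrt_wheel"] = "not installed"
    if shutil.which("dpkg-query"):
        out = subprocess.run(
            ["dpkg-query", "-W", "-f=${Version}", "libnvinfer10"],
            capture_output=True, text=True,
        )
        report["libnvinfer10"] = out.stdout.strip() or "not installed"
    return report

for key, value in collect_report().items():
    print(f"{key}: {value}")
```

Running it in the builder and in the inference container and diffing the output should immediately show whether the Python wheel and the system libnvinfer10 package agree.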

If none of these suggestions resolve the issue, please provide more information about your environment, including the exact commands used for serialization and inference, as well as any relevant logs or error messages.

*** Please give the thumbs up if you get answers you like. Or provide feedback to help us improve the answer. ***

Hi,

Could you try to deploy the model in the container used for serialization, to validate whether the engine works there first?
In the inference container, do you install TensorRT manually, or is it pre-installed in dustynv/l4t-pytorch:r36.4.0?

Thanks.

Could you try to deploy the model in the container used for serialization, to validate whether the engine works there first?

In the build environment, we run the model after building it to verify basic functionality, and there it works.

In the inference container, do you install TensorRT manually, or is it pre-installed in dustynv/l4t-pytorch:r36.4.0?

No, we compile a wheel for it ourselves (since TRT 10.3.0 doesn’t have binaries for Python 3.12 by default) and install it from our private PyPI using uv.

Hi,

Could you share more details about how you compile it?
Thanks.

Could you share more details about how you compile it?

There’s not much to it, really: we run the compilation inside a Docker container on a Jetson Orin NX device running in our build farm (a Seeed Studio reComputer J4012, if it matters). The compilation works by cloning the TRT repo (git clone --recursive --single-branch --branch "$TENSORRT_VERSION" https://github.com/NVIDIA/TensorRT.git), installing a standalone Python with uv (to guarantee the version), creating a venv, and then running the build.sh script inside the repo.

We then use the same compiled TRT wheel both in the engine builder (another container running on the same reComputer), to test the generated engine, and in the inference environment.

Hi,

Are you able to share the package, and how to set up the build and runtime containers, with us?
We need to reproduce this locally to gather more information.

Thanks.

I sent the wheel and relevant Dockerfiles via private message.

In the meantime, a colleague of mine figured it out. Turns out that the libnvinfer in the image wasn’t playing nice - by doing an apt-get install libnvinfer10 rather than relying on the library version that came with the image, it started working.

For the record, the apt-get install gave us libnvinfer 10.7.0, vs. the 10.4.0 that shipped with the image.
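In case it helps others debug the same thing, here is a quick Linux-only sketch (the helper name is ours) that shows which libnvinfer shared objects the process actually maps after importing the wheel, which is how a 10.3.0-wheel-vs-10.4.0-library mismatch like ours would show up; it degrades gracefully when tensorrt isn’t installed:

```python
# Sketch: list the libnvinfer* shared objects mapped into the current
# process by reading /proc/self/maps (Linux only). Importing tensorrt
# first forces the libraries to load; the import is guarded so the
# script still runs where the wheel is absent.

def loaded_nvinfer_libs() -> list:
    """Return the libnvinfer* paths mapped into this process."""
    libs = set()
    try:
        with open("/proc/self/maps") as maps:
            for line in maps:
                parts = line.split()
                if parts and "libnvinfer" in parts[-1]:
                    libs.add(parts[-1])
    except FileNotFoundError:  # not Linux
        pass
    return sorted(libs)

try:
    import tensorrt  # noqa: F401  (only imported to force the library load)
except ImportError:
    pass

print(loaded_nvinfer_libs())
```

Comparing the printed paths (and their dpkg-owned versions) against the wheel version makes this class of mismatch visible before deserialization fails.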

There has been no update from you for a while, so we assume this is no longer an issue.
Hence, we are closing this topic. If you need further support, please open a new one.
Thanks

Hi,

So does it work if you force the libnvinfer to be the version you expected?

Thanks.