YOLOv11 Triton Inference Server Deployment Problem

Title: Engine Deserialization Error During Deployment on Triton Inference Server

Description:
I am encountering an issue while deploying a YOLOv11 model on Triton Inference Server. The model was successfully converted to a TensorRT engine and performed inference correctly using the YOLO command-line interface. However, when deploying the model on Triton Inference Server, I receive the following error:

ERROR: 1: [stdArchiveReader.cpp::StdArchiveReader::32] Error Code 1: Serialization (Serialization assertion magicTagRead == kMAGIC_TAG failed.Magic tag does not match)
ERROR: 4: [runtime.cpp::deserializeCudaEngine::66] Error Code 4: Internal Error (Engine deserialization failed.)

Environment Details:

  • Operating System: Ubuntu 20.04
  • GPUs: NVIDIA V100 and NVIDIA RTX 3080 (tested on both)
  • CUDA Version: 11.7
  • TensorRT Versions Tested: 8.4.3.1, 8.2.0.5
  • Triton Server Versions Tested: 22.06, 24.11
  • PyTorch Versions Tested: 1.10.1, 2.0.0
  • NVIDIA Driver Version Tested: 515.105.01

Steps to Reproduce:

  1. Convert the YOLOv11 model to a TensorRT engine using the following command:

    $ yolo export model=yolov11m.pt format=engine simplify
    

    The conversion completes without any issues!

  2. Test the generated engine locally with the following command:

    $ yolo predict model=yolov11m.engine source=test.jpg
    

    The engine performs inference successfully without any errors!

  3. Deploy the engine on Triton Inference Server. During deployment, the above error is encountered.
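
    For reference, a typical way to stage a TensorRT engine for Triton's TensorRT backend looks roughly like this (the repository layout and container tag are shown for illustration only):

    $ mkdir -p models/yolov11/1
    $ cp yolov11m.engine models/yolov11/1/model.plan
    $ docker run --gpus all --rm -p 8000:8000 -p 8001:8001 -p 8002:8002 \
          -v $(pwd)/models:/models nvcr.io/nvidia/tritonserver:24.11-py3 \
          tritonserver --model-repository=/models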

Are there any specific compatibility requirements or configuration steps needed to resolve this issue?

Thank you for your assistance!

Hi @ahmetselimdemirel ,
TensorRT engines have a strict version requirement: the TensorRT version used to build the engine must match the TensorRT version used to load it, and the engine must be built for the GPU it runs on. Please make sure the engine was generated on the same GPU and with the same TensorRT version that your Triton server uses.
Can you please confirm that?
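
One quick way to check is to compare the TensorRT version in the export environment with the one shipped inside the Triton container. A minimal sketch, assuming the 24.11 Triton image (the tag is just an example; the exact TensorRT version for each release is also listed in the Triton container release notes):

    $ python -c "import tensorrt; print(tensorrt.__version__)"
    $ docker run --rm nvcr.io/nvidia/tritonserver:24.11-py3 \
          bash -c "dpkg -l | grep -Ei 'nvinfer|tensorrt'"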

Thanks

This appears to be a Triton issue, so I recommend reaching out to the Triton forum.

You can follow the Ultralytics Triton Inference Server guide.

You can’t use the TensorRT export like that because of the TensorRT version mismatch between the export environment and the Triton container. But you can use the ONNX-exported model with the TensorRT backend, as shown in the guide.
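
A rough sketch of that route (file and directory names are illustrative; see the guide for the accompanying config.pbtxt):

    $ yolo export model=yolov11m.pt format=onnx simplify
    $ mkdir -p model_repository/yolov11/1
    $ cp yolov11m.onnx model_repository/yolov11/1/model.onnx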

Otherwise, you need to pull the TensorRT Docker image that ships the same TensorRT version as your Triton container and run the export inside it.
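
For example, assuming the 24.11 Triton image, the NGC TensorRT container from the same monthly release should carry the same TensorRT version (confirm against the NGC framework support matrix; the tags and model name below are illustrative):

    $ docker run --gpus all -it --rm -v $(pwd):/workspace nvcr.io/nvidia/tensorrt:24.11-py3
    # inside the container (pip will also pull in PyTorch):
    $ pip install ultralytics
    $ yolo export model=yolov11m.pt format=engine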