Triton Server Error with TAO FasterRCNN model: Validation failed: libNamespace == nullptr

• Hardware: Ubuntu 22.04 RTX 4090
• Network Type: FasterRCNN TAO model
• TAO version: 5.5.0
Training spec:

# Copyright (c) 2017-2020, NVIDIA CORPORATION.  All rights reserved.
random_seed: 42

verbose: True
model_config {
input_image_config {
image_type: RGB
image_channel_order: 'bgr'
size_height_width {
height: 640
width: 640
}
    image_channel_mean {
        key: 'b'
        value: 103.939
    }
    image_channel_mean {
        key: 'g'
        value: 116.779
    }
    image_channel_mean {
        key: 'r'
        value: 123.68
    }
image_scaling_factor: 1.0
max_objects_num_per_image: 100
}
arch: "resnet:18"
anchor_box_config {
scale: 64.0
scale: 128.0
scale: 256.0
ratio: 1.0
ratio: 0.5
ratio: 2.0
}
freeze_bn: True
freeze_blocks: 0
freeze_blocks: 1
roi_mini_batch: 256
rpn_stride: 16
use_bias: False
roi_pooling_config {
pool_size: 7
pool_size_2x: False
}
all_projections: True
}
dataset_config {
  data_sources: {
    tfrecords_path: "/workspace/tao-experiments/data/faster_rcnn/tfrecords/new_trainval/new_trainval*"
    image_directory_path: "/workspace/tao-experiments/data/training"
  }
image_extension: 'png'
target_class_mapping {
key: 'item'
value: 'item'
}
target_class_mapping {
key: 'person'
value: 'person'
}
validation_fold: 0
}
augmentation_config {
preprocessing {
output_image_width: 640
output_image_height: 640
output_image_channel: 3
min_bbox_width: 1.0
min_bbox_height: 1.0
enable_auto_resize: True
}
spatial_augmentation {
hflip_probability: 0.5
vflip_probability: 0.0
zoom_min: 1.0
zoom_max: 1.0
translate_max_x: 0
translate_max_y: 0
}
color_augmentation {
hue_rotation_max: 0.0
saturation_shift_max: 0.0
contrast_scale_max: 0.0
contrast_center: 0.5
}
}
training_config {
visualizer {
    enabled: False
    num_images: 3
}
enable_augmentation: True
enable_qat: False
batch_size_per_gpu: 8
num_epochs: 12
rpn_min_overlap: 0.3
rpn_max_overlap: 0.7
classifier_min_overlap: 0.0
classifier_max_overlap: 0.5
gt_as_roi: False
std_scaling: 1.0
classifier_regr_std {
key: 'x'
value: 10.0
}
classifier_regr_std {
key: 'y'
value: 10.0
}
classifier_regr_std {
key: 'w'
value: 5.0
}
classifier_regr_std {
key: 'h'
value: 5.0
}

rpn_mini_batch: 256
rpn_pre_nms_top_N: 12000
rpn_nms_max_boxes: 2000
rpn_nms_overlap_threshold: 0.7

regularizer {
type: L2
weight: 1e-4
}

optimizer {
sgd {
lr: 0.02
momentum: 0.9
decay: 0.0
nesterov: False
}
}

learning_rate {
soft_start {
base_lr: 0.02
start_lr: 0.002
soft_start: 0.1
annealing_points: 0.8
annealing_points: 0.9
annealing_divider: 10.0
}
}

lambda_rpn_regr: 1.0
lambda_rpn_class: 1.0
lambda_cls_regr: 1.0
lambda_cls_class: 1.0
}
inference_config {
images_dir: '/workspace/tao-experiments/data/test_samples'
batch_size: 1
detection_image_output_dir: '/workspace/tao-experiments/faster_rcnn/inference_results_imgs_retrain'
labels_dump_dir: '/workspace/tao-experiments/faster_rcnn/inference_dump_labels_retrain'
rpn_pre_nms_top_N: 6000
rpn_nms_max_boxes: 300
rpn_nms_overlap_threshold: 0.7
object_confidence_thres: 0.0001
bbox_visualize_threshold: 0.6
classifier_nms_max_boxes: 100
classifier_nms_overlap_threshold: 0.3
nms_score_bits: 8
}
evaluation_config {
batch_size: 1
validation_period_during_training: 1
rpn_pre_nms_top_N: 6000
rpn_nms_max_boxes: 300
rpn_nms_overlap_threshold: 0.7
classifier_nms_max_boxes: 100
classifier_nms_overlap_threshold: 0.3
object_confidence_thres: 0.0001
gt_matching_iou_threshold: 0.5
}

Hello, I am having issues with a TAO FasterRCNN model trained via transfer learning, specifically when serving it with the Triton Inference Server after exporting it as a TRT engine. I trained following the guidelines in this notebook:

Training was successful and inference looked normal. However, during inference I received this error:

[02/10/2025-16:19:07] [TRT] [F] Validation failed: libNamespace == nullptr 
 [02/10/2025-16:19:07] [TRT] [E] std::exception

Note: I also received this error without any custom data and just the tutorial data, so to reproduce, you can use the tutorial data or I can send the tutorial model.

This error caused no problems for inference with the TAO CLI. But when I tried to launch a Triton Server instance with this model to test inference times, the server crashed because of this error. Is there a way to make the server ignore this validation issue, or to fix the error in the model?

Note that this is listed as a known limitation of TAO Toolkit 5.2.0 in the 5.3.0 release notes, at the link below:

Also, I used Triton Server version 24.04 because it is the last release with TensorRT 8, and as far as I can see the TAO Toolkit does not support TRT 10 yet. Here is the command used to launch the Triton server:

docker run --gpus=1 --rm -p 8000:8000 -p 8001:8001 -p 8002:8002 -v /home/ubuntu-testing/model_repository:/models tritonserver --model-repository=/models

Here is my server config file for the model. I am not certain the output shapes are correct, but from what I can see it is not even getting to the config file before the server stops.

name: "FRCNN-resnet50"
platform: "tensorrt_plan"
max_batch_size : 0
input [
  {
    name: "input_image"
    data_type: TYPE_FP16
    dims: [ 3, 640, 640 ]
    reshape { shape: [ 1, 3, 640, 640 ] }
  }
]
output [
  {
    name: "nms_out"
    data_type: TYPE_FP32
    dims: [ 1, 1, 100, 7 ]
    reshape { shape: [ 1, 1, 100, 7 ] }
  },
  {
    name: "nms_out_1"
    data_type: TYPE_FP32
    dims: [ 1, 1 , 1, 1]
    reshape { shape: [ 1, 1, 1, 1 ] }
  }
]
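
Since the output shapes in the config above are uncertain, one option is to generate the file from a single list of tensor descriptions rather than editing it by hand. The following is a minimal Python sketch of that idea. `TensorSpec` and `render_config` are hypothetical helper names, and the tensor names, data types, and shapes are copied from this thread (the FP32 input follows the config that Triton auto-completes later in the thread), not read from the engine, so they are assumptions to verify against the real engine bindings.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TensorSpec:
    """One model input or output (hypothetical helper type)."""
    name: str
    data_type: str   # Triton type name, e.g. "TYPE_FP32"
    dims: List[int]  # full shape; with max_batch_size 0 this includes the batch dim


def _render_tensor(spec: TensorSpec) -> str:
    """Render one tensor as a protobuf text-format message."""
    dims = ", ".join(str(d) for d in spec.dims)
    return (
        "  {\n"
        f'    name: "{spec.name}"\n'
        f"    data_type: {spec.data_type}\n"
        f"    dims: [ {dims} ]\n"
        "  }"
    )


def render_config(name: str, inputs: List[TensorSpec], outputs: List[TensorSpec]) -> str:
    """Render a config.pbtxt for a TensorRT plan with Triton batching disabled."""
    lines = [
        f'name: "{name}"',
        'platform: "tensorrt_plan"',
        "max_batch_size: 0",
        "input [",
        ",\n".join(_render_tensor(t) for t in inputs),
        "]",
        "output [",
        ",\n".join(_render_tensor(t) for t in outputs),
        "]",
    ]
    return "\n".join(lines) + "\n"


# Values below are assumptions taken from this thread, not from the engine.
config_text = render_config(
    "FRCNN-resnet50",
    inputs=[TensorSpec("input_image", "TYPE_FP32", [1, 3, 640, 640])],
    outputs=[
        TensorSpec("nms_out", "TYPE_FP32", [1, 1, 100, 7]),
        TensorSpec("nms_out_1", "TYPE_FP32", [1, 1, 1, 1]),
    ],
)
print(config_text)
```

With `max_batch_size: 0`, the `dims` entries describe the full tensor shape, so no `reshape` is needed in this variant.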

And here is the output from the Triton server when it does not launch:

I0210 23:05:46.603013 1] Pinned memory pool is created at '0x7cb6d6000000' with size 268435456
I0210 23:05:46.604848 1] CUDA memory pool is created on device 0 with size 67108864
I0210 23:05:46.608765 1] loading: FRCNN-resnet50:1
I0210 23:05:46.634964 1] TRITONBACKEND_Initialize: tensorrt
I0210 23:05:46.634975 1] Triton TRITONBACKEND API version: 1.19
I0210 23:05:46.634977 1] 'tensorrt' TRITONBACKEND API version: 1.19
I0210 23:05:46.634979 1] backend configuration:
I0210 23:05:46.636848 1] TRITONBACKEND_ModelInitialize: FRCNN-resnet50 (version 1)
I0210 23:05:46.691943 1] Loaded engine size: 84 MiB
E0210 23:05:46.707516 1] Validation failed: libNamespace == nullptr

Thanks for your help!

This is a bug in the TensorRT plugin code. Please sync to the latest TensorRT plugin code.

How would I do this? Do you mean on the Triton server side, or in the TAO image before the TRT engine is generated? Also, by syncing the plugin code, do you mean updating the TRT version or something else?

It looks like I was able to solve this by creating the TRT engine with trtexec in TensorRT 10.8. The error no longer appears, but my server is still not starting. I have updated to Triton server version 25.01. It looks like something else may be causing the failure. There are no failure messages, though:


docker run --gpus=1 --rm -p 8000:8000 -p 8001:8001 -p 8002:8002 -v /home/ubuntu-testing/model_repository:/models tritonserver --model-repository=/models --log-verbose 1


I0211 17:34:51.850449 1] "Create CacheManager with cache_dir: '/opt/tritonserver/caches'"
I0211 17:34:51.986746 1] "Pinned memory pool is created at '0x7a4f1a000000' with size 268435456"
I0211 17:34:51.988590 1] "CUDA memory pool is created on device 0 with size 67108864"
I0211 17:34:51.992076 1] "Server side auto-completed config: "
name: "FRCNN-resnet50"
platform: "tensorrt_plan"
max_batch_size: 1
input {
  name: "input_image"
  data_type: TYPE_FP32
  dims: 1
  dims: 3
  dims: 640
  dims: 640
  reshape {
    shape: 1
    shape: 3
    shape: 640
    shape: 640
  }
}
output {
  name: "nms_out"
  data_type: TYPE_FP32
  dims: 1
  dims: 1
  dims: 100
  dims: 7
}
output {
  name: "nms_out_1"
  data_type: TYPE_FP32
  dims: 1
  dims: 1
  dims: 1
  dims: 1
}
default_model_filename: "model.plan"
backend: "tensorrt"

I0211 17:34:51.992110 1] "loading: FRCNN-resnet50:1"
I0211 17:34:51.992182 1] "Adding default backend config setting: default-max-batch-size,4"
I0211 17:34:51.992201 1] "OpenLibraryHandle: /opt/tritonserver/backends/tensorrt/"
I0211 17:34:52.018443 1] "TRITONBACKEND_Initialize: tensorrt"
I0211 17:34:52.018457 1] "Triton TRITONBACKEND API version: 1.19"
I0211 17:34:52.018459 1] "'tensorrt' TRITONBACKEND API version: 1.19"
I0211 17:34:52.018461 1] "backend configuration:\n{\"cmdline\":{\"auto-complete-config\":\"true\",\"backend-directory\":\"/opt/tritonserver/backends\",\"min-compute-capability\":\"6.000000\",\"default-max-batch-size\":\"4\"}}"
I0211 17:34:52.018470 1] "Registering TensorRT Plugins"
I0211 17:34:52.020409 1] "TRITONBACKEND_ModelInitialize: FRCNN-resnet50 (version 1)"
I0211 17:34:52.020642 1] "ModelConfig 64-bit fields:"
I0211 17:34:52.020645 1] "\tModelConfig::dynamic_batching::default_priority_level"
I0211 17:34:52.020646 1] "\tModelConfig::dynamic_batching::default_queue_policy::default_timeout_microseconds"
I0211 17:34:52.020648 1] "\tModelConfig::dynamic_batching::max_queue_delay_microseconds"
I0211 17:34:52.020649 1] "\tModelConfig::dynamic_batching::priority_levels"
I0211 17:34:52.020651 1] "\tModelConfig::dynamic_batching::priority_queue_policy::key"
I0211 17:34:52.020652 1] "\tModelConfig::dynamic_batching::priority_queue_policy::value::default_timeout_microseconds"
I0211 17:34:52.020653 1] "\tModelConfig::ensemble_scheduling::step::model_version"
I0211 17:34:52.020655 1] "\tModelConfig::input::dims"
I0211 17:34:52.020656 1] "\tModelConfig::input::reshape::shape"
I0211 17:34:52.020657 1] "\tModelConfig::instance_group::secondary_devices::device_id"
I0211 17:34:52.020659 1] "\tModelConfig::model_warmup::inputs::value::dims"
I0211 17:34:52.020660 1] "\tModelConfig::optimization::cuda::graph_spec::graph_lower_bound::input::value::dim"
I0211 17:34:52.020661 1] "\tModelConfig::optimization::cuda::graph_spec::input::value::dim"
I0211 17:34:52.020663 1] "\tModelConfig::output::dims"
I0211 17:34:52.020664 1] "\tModelConfig::output::reshape::shape"
I0211 17:34:52.020665 1] "\tModelConfig::sequence_batching::direct::max_queue_delay_microseconds"
I0211 17:34:52.020667 1] "\tModelConfig::sequence_batching::max_sequence_idle_microseconds"
I0211 17:34:52.020668 1] "\tModelConfig::sequence_batching::oldest::max_queue_delay_microseconds"
I0211 17:34:52.020669 1] "\tModelConfig::sequence_batching::state::dims"
I0211 17:34:52.020670 1] "\tModelConfig::sequence_batching::state::initial_state::dims"
I0211 17:34:52.020672 1] "\tModelConfig::version_policy::specific::versions"
I0211 17:34:52.020732 1] "Setting the CUDA device to GPU0 to auto-complete config for FRCNN-resnet50"
I0211 17:34:52.021669 1] "Using explicit serialized file 'model.plan' to auto-complete config for FRCNN-resnet50"
I0211 17:34:52.078683 1] "Loaded engine size: 85 MiB"
I0211 17:34:52.099056 1] "Local registry did not find ProposalDynamic creator. Will try parent registry if enabled."
I0211 17:34:52.099067 1] "Global registry found ProposalDynamic creator."
I0211 17:34:52.099076 1] "Local registry did not find CropAndResizeDynamic creator. Will try parent registry if enabled."
I0211 17:34:52.099079 1] "Global registry found CropAndResizeDynamic creator."
I0211 17:34:52.099147 1] "Local registry did not find NMSDynamic_TRT creator. Will try parent registry if enabled."
I0211 17:34:52.099149 1] "Global registry found NMSDynamic_TRT creator."

Any insight? The server just quits after this output, am I missing something?
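
When a process quits without logging an error, its exit status is often the only clue. As a hedged illustration, the Python sketch below decodes an exit status into a signal name: `subprocess` reports death by signal N as `-N`, while shells and `docker run` report `128 + N`, so 139 corresponds to SIGSEGV on Linux. The helper names are hypothetical, and the child process that kills itself is only a stand-in for the crashing server.

```python
import signal
import subprocess
import sys
from typing import List


def describe_exit(returncode: int) -> str:
    """Turn an exit status into a readable description (heuristic)."""
    if returncode < 0:
        signum = -returncode          # subprocess convention: -N means signal N
    elif returncode > 128:
        signum = returncode - 128     # shell / docker convention: 128 + N
    else:
        signum = None
    if signum is not None:
        try:
            return f"killed by {signal.Signals(signum).name}"
        except ValueError:
            pass  # not a valid signal number, so treat it as a plain exit code
    return f"exited with code {returncode}"


def run_and_report(argv: List[str]) -> str:
    """Run a command and describe how it ended."""
    completed = subprocess.run(argv)
    return describe_exit(completed.returncode)


# Stand-in for the crashing server: a child that sends itself SIGSEGV.
child = [sys.executable, "-c", "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"]
print(run_and_report(child))
print(describe_exit(139))
```

For a container, the same status can be read with `echo $?` right after `docker run` returns.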

Note that I have also tried loading the engine file in a Python script, which results in a segmentation fault (tested both in my own environment and in the TensorRT container):

import time
import numpy as np
import tensorrt as trt
import pycuda.autoinit  # creates and activates a CUDA context on import
import pycuda.driver as cuda

EXPLICIT_BATCH = 1 << (int)(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)

batch = 1
host_inputs  = []
cuda_inputs  = []
host_outputs = []
cuda_outputs = []
bindings = []

def Inference(engine):
    # random data stands in for a preprocessed 640x640 image
    image = np.random.rand(1, 3, 640, 640).astype(np.float32)

    # host buffers are flat, so flatten the image before copying
    np.copyto(host_inputs[0], image.ravel())
    stream = cuda.Stream()
    context = engine.create_execution_context()

    start_time = time.time()
    cuda.memcpy_htod_async(cuda_inputs[0], host_inputs[0], stream)
    stream.synchronize()
    # synchronous execution; assumes bindings are in engine tensor order
    context.execute_v2(bindings)
    cuda.memcpy_dtoh_async(host_outputs[0], cuda_outputs[0], stream)
    stream.synchronize()
    print("execute times "+str(time.time()-start_time))

    output = host_outputs[0]
    return output

def PrepareEngine():
    trt.init_libnvinfer_plugins(trt.Logger(), '')
    with open('test.trt', 'rb') as f:
        serialized_engine = f.read()

    runtime = trt.Runtime(TRT_LOGGER)
    engine = runtime.deserialize_cuda_engine(serialized_engine)

    # create one host/device buffer pair per engine tensor
    for binding in engine:
        size = trt.volume(engine.get_tensor_shape(binding)) * batch
        host_mem = cuda.pagelocked_empty(shape=[size],dtype=np.float32)
        cuda_mem = cuda.mem_alloc(host_mem.nbytes)

        bindings.append(int(cuda_mem))
        if engine.get_tensor_mode(binding)==trt.TensorIOMode.INPUT:
            host_inputs.append(host_mem)
            cuda_inputs.append(cuda_mem)
        else:
            host_outputs.append(host_mem)
            cuda_outputs.append(cuda_mem)

    return engine

if __name__ == "__main__":
    engine = PrepareEngine()
    Inference(engine)

    engine = []

Which resulted in the following output:

[02/11/2025-14:34:27] [TRT] [I] Loaded engine size: 85 MiB
[02/11/2025-14:34:27] [TRT] [V] Local registry did not find ProposalDynamic creator. Will try parent registry if enabled.
[02/11/2025-14:34:27] [TRT] [V] Global registry found ProposalDynamic creator.
[02/11/2025-14:34:27] [TRT] [V] Local registry did not find CropAndResizeDynamic creator. Will try parent registry if enabled.
[02/11/2025-14:34:27] [TRT] [V] Global registry found CropAndResizeDynamic creator.
[02/11/2025-14:34:27] [TRT] [V] Local registry did not find NMSDynamic_TRT creator. Will try parent registry if enabled.
[02/11/2025-14:34:27] [TRT] [V] Global registry found NMSDynamic_TRT creator.
Segmentation fault (core dumped)

EDIT: I got the Python code working by changing how the plugins are initialized, in this line:

trt.init_libnvinfer_plugins(trt.Logger(), '')

Changing it to:

 trt.init_libnvinfer_plugins(TRT_LOGGER, '')

I assume the TRT plugins in the server may not be initialized correctly, since the server stopped at the same point where this Python code originally did.

trtexec works correctly with the following command:

/home/ubuntu-testing/TensorRT- --loadEngine='/home/ubuntu-testing/TensorRT-'


The “Validation failed: libNamespace == nullptr” error is caused by a bug in this version of the TensorRT plugin code:
TensorRT/plugin/proposalPlugin/proposalPlugin.cpp at 23.08 · NVIDIA/TensorRT · GitHub and
TensorRT/plugin/proposalPlugin/proposalPlugin.cpp at 23.08 · NVIDIA/TensorRT · GitHub

PLUGIN_VALIDATE(libNamespace == nullptr);

should be

PLUGIN_VALIDATE(libNamespace != nullptr);

The issue is fixed after TRT 9.0.
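
To make the effect of the inverted check concrete, here is a small Python model of the validation logic. It is only an illustration: `plugin_validate` and the two `set_namespace_*` functions are hypothetical stand-ins, not the TensorRT C++ code, and the sketch assumes that PLUGIN_VALIDATE fails whenever its condition is false, which is what the logged message suggests.

```python
from typing import Optional


class ValidationError(Exception):
    """Raised when a validated condition does not hold."""


def plugin_validate(condition: bool, text: str) -> None:
    """Model of a validate macro: fail when the condition is false."""
    if not condition:
        raise ValidationError(f"Validation failed: {text}")


def set_namespace_buggy(lib_namespace: Optional[str]) -> Optional[str]:
    # Inverted check: only a missing namespace passes validation.
    plugin_validate(lib_namespace is None, "libNamespace == nullptr")
    return lib_namespace


def set_namespace_fixed(lib_namespace: Optional[str]) -> Optional[str]:
    # Intended check: any non-null namespace (even an empty string) passes.
    plugin_validate(lib_namespace is not None, "libNamespace != nullptr")
    return lib_namespace


try:
    set_namespace_buggy("")
except ValidationError as err:
    buggy_result = str(err)
else:
    buggy_result = "accepted"

fixed_result = set_namespace_fixed("")
print(buggy_result)
print(repr(fixed_result))
```

In this model, the buggy variant rejects every valid namespace string, which matches the symptom of the error appearing even with the unmodified tutorial data.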

You can modify the plugin code and rebuild, then replace the

Thanks for your answer. Before your response I found a different workaround, described in my earlier comment: converting the ONNX model to a TRT engine with trtexec from TRT 10.8. If you recommend sticking with TensorRT 8.6, I can try your method; do let me know. However, I still need help with the server not starting. With the 10.8 engine, inference works when I load it in Python, and trtexec passes, as noted in my previous comments. The Triton server still fails to start, as shown above, and I assume this is because the plugins are not being loaded properly. In Python I used the call below to fix this:

 trt.init_libnvinfer_plugins(TRT_LOGGER, '')

Is there something I can do on the Triton Server to accomplish something similar, such as setting environment variables in the server for the plugin locations?

Note that I have also tried generating the TRT engine inside the Triton server container. Launching the server then prints a segmentation fault, as the Python code did:

root@78cb1e3d126e:/opt/tritonserver# tritonserver --model-repository=/opt/tritonserver/models --http-port=8000 --grpc-port=8001 --metrics-port=8002 --log-verbose 1
I0212 20:33:44.108490 297] "Create CacheManager with cache_dir: '/opt/tritonserver/caches'"
I0212 20:33:44.246535 297] "Pinned memory pool is created at '0x7b3de6000000' with size 268435456"
I0212 20:33:44.248288 297] "CUDA memory pool is created on device 0 with size 67108864"
I0212 20:33:44.251610 297] "Server side auto-completed config: "
name: "FRCNN-resnet50"
platform: "tensorrt_plan"
max_batch_size: 1
input {
  name: "input_image"
  data_type: TYPE_FP32
  dims: 1
  dims: 3
  dims: 640
  dims: 640
  reshape {
    shape: 1
    shape: 3
    shape: 640
    shape: 640
  }
}
output {
  name: "nms_out"
  data_type: TYPE_FP32
  dims: 1
  dims: 1
  dims: 100
  dims: 7
}
output {
  name: "nms_out_1"
  data_type: TYPE_FP32
  dims: 1
  dims: 1
  dims: 1
  dims: 1
}
default_model_filename: "model.plan"
backend: "tensorrt"

I0212 20:33:44.251659 297] "loading: FRCNN-resnet50:1"
I0212 20:33:44.251754 297] "Adding default backend config setting: default-max-batch-size,4"
I0212 20:33:44.251774 297] "OpenLibraryHandle: /opt/tritonserver/backends/tensorrt/"
I0212 20:33:44.268152 297] "TRITONBACKEND_Initialize: tensorrt"
I0212 20:33:44.268174 297] "Triton TRITONBACKEND API version: 1.19"
I0212 20:33:44.268178 297] "'tensorrt' TRITONBACKEND API version: 1.19"
I0212 20:33:44.268180 297] "backend configuration:\n{\"cmdline\":{\"auto-complete-config\":\"true\",\"backend-directory\":\"/opt/tritonserver/backends\",\"min-compute-capability\":\"6.000000\",\"default-max-batch-size\":\"4\"}}"
I0212 20:33:44.268189 297] "Registering TensorRT Plugins"
I0212 20:33:44.270108 297] "TRITONBACKEND_ModelInitialize: FRCNN-resnet50 (version 1)"
I0212 20:33:44.270330 297] "ModelConfig 64-bit fields:"
I0212 20:33:44.270333 297] "\tModelConfig::dynamic_batching::default_priority_level"
I0212 20:33:44.270336 297] "\tModelConfig::dynamic_batching::default_queue_policy::default_timeout_microseconds"
I0212 20:33:44.270338 297] "\tModelConfig::dynamic_batching::max_queue_delay_microseconds"
I0212 20:33:44.270340 297] "\tModelConfig::dynamic_batching::priority_levels"
I0212 20:33:44.270342 297] "\tModelConfig::dynamic_batching::priority_queue_policy::key"
I0212 20:33:44.270344 297] "\tModelConfig::dynamic_batching::priority_queue_policy::value::default_timeout_microseconds"
I0212 20:33:44.270346 297] "\tModelConfig::ensemble_scheduling::step::model_version"
I0212 20:33:44.270349 297] "\tModelConfig::input::dims"
I0212 20:33:44.270350 297] "\tModelConfig::input::reshape::shape"
I0212 20:33:44.270352 297] "\tModelConfig::instance_group::secondary_devices::device_id"
I0212 20:33:44.270354 297] "\tModelConfig::model_warmup::inputs::value::dims"
I0212 20:33:44.270356 297] "\tModelConfig::optimization::cuda::graph_spec::graph_lower_bound::input::value::dim"
I0212 20:33:44.270358 297] "\tModelConfig::optimization::cuda::graph_spec::input::value::dim"
I0212 20:33:44.270360 297] "\tModelConfig::output::dims"
I0212 20:33:44.270362 297] "\tModelConfig::output::reshape::shape"
I0212 20:33:44.270364 297] "\tModelConfig::sequence_batching::direct::max_queue_delay_microseconds"
I0212 20:33:44.270366 297] "\tModelConfig::sequence_batching::max_sequence_idle_microseconds"
I0212 20:33:44.270368 297] "\tModelConfig::sequence_batching::oldest::max_queue_delay_microseconds"
I0212 20:33:44.270370 297] "\tModelConfig::sequence_batching::state::dims"
I0212 20:33:44.270373 297] "\tModelConfig::sequence_batching::state::initial_state::dims"
I0212 20:33:44.270375 297] "\tModelConfig::version_policy::specific::versions"
I0212 20:33:44.270429 297] "Setting the CUDA device to GPU0 to auto-complete config for FRCNN-resnet50"
I0212 20:33:44.271413 297] "Using explicit serialized file 'model.plan' to auto-complete config for FRCNN-resnet50"
I0212 20:33:44.321936 297] "Loaded engine size: 85 MiB"
I0212 20:33:44.341177 297] "Local registry did not find ProposalDynamic creator. Will try parent registry if enabled."
I0212 20:33:44.341192 297] "Global registry found ProposalDynamic creator."
I0212 20:33:44.341201 297] "Local registry did not find CropAndResizeDynamic creator. Will try parent registry if enabled."
I0212 20:33:44.341205 297] "Global registry found CropAndResizeDynamic creator."
I0212 20:33:44.341274 297] "Local registry did not find NMSDynamic_TRT creator. Will try parent registry if enabled."
I0212 20:33:44.341278 297] "Global registry found NMSDynamic_TRT creator."
Segmentation fault (core dumped)

For running Triton with TAO models, there is an official GitHub repo: GitHub - NVIDIA-AI-IOT/tao-toolkit-triton-apps: Sample app code for deploying TAO Toolkit trained models to Triton. Could you try running with it and use it as a reference?

I’m sure the models in that repo will work. However, I am working with TAO FasterRCNN, which is not one of the models it provides an example for, so I assume I would end up with the same issue I am facing here. Any input on my previous comment? Do you believe TensorRT 8 would solve my issue?

In the official tao-toolkit-triton-apps GitHub repo, libnvinfer_plugin is built and then replaced. See its Dockerfile: tao-toolkit-triton-apps/docker/Dockerfile at 9a30f9692bf29fb728520e9dba1c79be2bf65e74 · NVIDIA-AI-IOT/tao-toolkit-triton-apps · GitHub.

So, for your own Triton server, you also need to build the TensorRT plugin and make sure it replaces the one in your Triton server.
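
One way to check whether a rebuilt plugin library has actually been put in place is to list every copy of `libnvinfer_plugin` that the container could load. The sketch below does this in Python under stated assumptions: `find_plugin_libraries` and `default_search_dirs` are hypothetical helper names, and the two fallback directories are guesses at common install locations, so adjust them for the image in use.

```python
import glob
import os
from typing import Dict, Iterable, List, Tuple


def find_plugin_libraries(
    search_dirs: Iterable[str], pattern: str = "libnvinfer_plugin.so*"
) -> Dict[str, List[Tuple[str, str]]]:
    """Map each directory to (path, resolved path) pairs for matching libraries."""
    found: Dict[str, List[Tuple[str, str]]] = {}
    for directory in search_dirs:
        matches = sorted(glob.glob(os.path.join(directory, pattern)))
        if matches:
            # realpath follows symlinks, showing which file a versioned link points to
            found[directory] = [(m, os.path.realpath(m)) for m in matches]
    return found


def default_search_dirs() -> List[str]:
    """Directories from LD_LIBRARY_PATH plus two assumed default locations."""
    dirs = [d for d in os.environ.get("LD_LIBRARY_PATH", "").split(":") if d]
    dirs += ["/usr/lib/x86_64-linux-gnu", "/opt/tensorrt/lib"]  # assumptions
    return list(dict.fromkeys(dirs))  # drop duplicates, keep order


for directory, libs in find_plugin_libraries(default_search_dirs()).items():
    for path, resolved in libs:
        size = os.path.getsize(resolved)
        print(f"{path} -> {resolved} ({size} bytes)")
```

Running it before and after the replacement, inside the Triton container, should show a changed target or file size if the rebuilt library is the one in place.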

The Triton server crashes with a segmentation fault after registering TensorRT plugins, specifically when trying to load the engine. This strongly suggests an issue with plugin loading or compatibility within the Triton environment. The fact that trtexec passes and Python inference works implies the engine itself is valid, but there’s a discrepancy when Triton attempts to use it.


  • Explicit Plugin Path: While Triton registers the plugins, it might not be finding them correctly during engine execution. Try explicitly setting the LD_LIBRARY_PATH environment variable within the Triton container to include the path to the TensorRT plugin libraries. This is the closest equivalent to the trt.init_libnvinfer_plugins() call in Python. You need to find where those plugins reside within the container. It’s often something like /usr/lib/x86_64-linux-gnu or /opt/tensorrt/lib.
docker run -d --name triton_server \
  --gpus all \
  -p 8000:8000 \
  -p 8001:8001 \
  -p 8002:8002 \
  -v /path/to/your/models:/opt/tritonserver/models \
  -e LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:/opt/tensorrt/lib:$LD_LIBRARY_PATH \
  <YOUR_TRITON_VERSION> \
  tritonserver --model-repository=/opt/tritonserver/models --http-port=8000 --grpc-port=8001 --metrics-port=8002 --log-verbose 1

Replace <YOUR_TRITON_VERSION> with the specific version you’re using. Replace /usr/lib/x86_64-linux-gnu:/opt/tensorrt/lib with the actual path to your TensorRT’s plugin libraries.


  • Triton and TensorRT Mismatch: This is the most likely culprit. Triton Server has a specific dependency on a particular TensorRT version. Even though you built the engine with TensorRT 10.8 and it runs in Python, the Triton container might be using an older TensorRT version. This is a very common cause of segmentation faults.
    • Identify Triton’s TensorRT Version: The easiest way to determine the TensorRT version inside the Triton container is to inspect the container’s environment variables or check the Triton server logs when it starts. It should print the TensorRT version it’s using. Look for something like TensorRT version: 8.6.1.
    • Match TensorRT Versions: The ideal solution is to use a Triton container that’s built against the same TensorRT version you used to create the engine (10.8 in your case). NVIDIA provides various Triton containers; select one that matches. If a matching container isn’t available, you might need to build your own Triton container from source to ensure compatibility.
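
The version check described above can be sketched in a few lines of Python. This is a simplified illustration, not an official compatibility rule: it assumes an engine is only loadable when the major, minor, and patch numbers of the building and loading TensorRT versions are equal, and the helper names are hypothetical. The version strings are supplied by hand here; where the Python bindings are installed, the runtime value could come from `tensorrt.__version__`.

```python
from typing import Tuple


def parse_version(text: str, parts: int = 3) -> Tuple[int, ...]:
    """Parse '10.8.0.43' into (10, 8, 0), padding short versions with zeros."""
    numbers = [int(piece) for piece in text.strip().split(".")[:parts]]
    numbers += [0] * (parts - len(numbers))
    return tuple(numbers)


def engine_matches_runtime(build_version: str, runtime_version: str) -> bool:
    """True when both versions agree on major.minor.patch (assumed rule)."""
    return parse_version(build_version) == parse_version(runtime_version)


def report(build_version: str, runtime_version: str) -> str:
    """Summarise the comparison in one line."""
    if engine_matches_runtime(build_version, runtime_version):
        return f"OK: engine and runtime are both TensorRT {build_version}"
    return (
        f"MISMATCH: engine built with {build_version}, "
        f"runtime is {runtime_version}"
    )


# Example values only: 10.8 is the trtexec version named in this thread.
print(report("10.8.0", "10.8.0"))
print(report("10.8.0", "8.6.1"))
```

A mismatch reported by a check like this would point toward rebuilding the engine inside the serving container or choosing a different container release.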

Thank you for your response. I will take a look at these options. I am certain the TRT versions match, but I will check the other items.