RetinaNet trained with TAO Toolkit cannot be run on the Triton server after converting with TensorRT 10.04

Recently, I updated the JetPack on my Orin Nano from JetPack 5 to JetPack 6. After the update, I noticed that a model I previously had running (a RetinaNet model trained with TAO Toolkit 5.3) no longer loads on the Triton server. The Triton server successfully converts the model from an ONNX file to model.plan, but when it tries to load the engine from model.plan, it crashes without any additional information.

I tested this model again on the previous JetPack version with TensorRT 8.6, and everything works as expected. Are there any known changes in the new TensorRT version that might cause compatibility issues with TAO Toolkit’s RetinaNet model?

• Hardware: Orin Nano with JetPack 6
• Network Type: RetinaNet trained with TAO Toolkit
• How to reproduce the issue: load the model into the Triton server with TensorRT 10.04

Logs:
I1028 07:34:25.875463 1 shared_library.cc:112] “OpenLibraryHandle: /opt/tritonserver/repoagents/trtconverter/libtritonrepoagent_trtconverter.so”
I1028 07:34:25.876578 1 model_config_utils.cc:716] “Server side auto-completed config: ”
name: “ball_tracking_v2-internal”
platform: “tensorrt_plan”
input {
name: “Input”
data_type: TYPE_FP32
dims: 3
dims: 160
dims: 320
}
output {
name: “NMS”
data_type: TYPE_FP32
dims: 1
dims: 200
dims: 7
}
output {
name: “NMS_1”
data_type: TYPE_FP32
dims: 1
dims: 1
dims: 1
}
default_model_filename: “model.plan”
model_warmup {
name: “regular sample”
batch_size: 1
inputs {
key: “Input”
value {
data_type: TYPE_FP32
dims: 3
dims: 160
dims: 320
random_data: true
}
}
}
backend: “tensorrt”
model_repository_agents {
agents {
name: “trtconverter”
parameters {
key: “./1/retinanet_balltracking_us_data.onnx”
value: “ --fp16”
}
}
}

I1028 07:34:25.876814 1 model_lifecycle.cc:441] “AsyncLoad() ‘ball_tracking_v2-internal’”
I1028 07:34:25.876972 1 model_lifecycle.cc:472] “loading: ball_tracking_v2-internal:1”
I1028 07:34:25.877072 1 model_lifecycle.cc:441] “AsyncLoad() ‘ball_tracking_v2’”
I1028 07:34:25.877215 1 model_lifecycle.cc:472] “loading: ball_tracking_v2:1”
I1028 07:34:25.877338 1 model_lifecycle.cc:551] “CreateModel() ‘ball_tracking_v2-internal’ version 1”
I1028 07:34:25.877658 1 backend_model.cc:503] “Adding default backend config setting: default-max-batch-size,4”
I1028 07:34:25.877772 1 shared_library.cc:112] “OpenLibraryHandle: /opt/tritonserver/backends/tensorrt/libtriton_tensorrt.so”
I1028 07:34:25.881588 1 model_lifecycle.cc:551] “CreateModel() ‘ball_tracking_v2’ version 1”
I1028 07:34:25.881814 1 backend_model.cc:503] “Adding default backend config setting: default-max-batch-size,4”
I1028 07:34:25.906358 1 tensorrt.cc:65] “TRITONBACKEND_Initialize: tensorrt”
I1028 07:34:25.906433 1 tensorrt.cc:75] “Triton TRITONBACKEND API version: 1.19”
I1028 07:34:25.906443 1 tensorrt.cc:81] “‘tensorrt’ TRITONBACKEND API version: 1.19”
I1028 07:34:25.906452 1 tensorrt.cc:105] “backend configuration:\n{"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"5.300000","default-max-batch-size":"4"}}”
I1028 07:34:25.906489 1 tensorrt.cc:187] “Registering TensorRT Plugins”
I1028 07:34:25.906536 1 logging.cc:49] “Registered plugin creator - ::BatchedNMSDynamic_TRT version 1”
I1028 07:34:25.906552 1 logging.cc:49] “Registered plugin creator - ::BatchedNMS_TRT version 1”
I1028 07:34:25.906566 1 logging.cc:49] “Registered plugin creator - ::BatchTilePlugin_TRT version 1”
I1028 07:34:25.906583 1 logging.cc:49] “Registered plugin creator - ::Clip_TRT version 1”
I1028 07:34:25.906613 1 logging.cc:49] “Registered plugin creator - ::CoordConvAC version 1”
I1028 07:34:25.906630 1 logging.cc:49] “Registered plugin creator - ::CropAndResizeDynamic version 1”
I1028 07:34:25.906644 1 logging.cc:49] “Registered plugin creator - ::CropAndResize version 1”
I1028 07:34:25.906658 1 logging.cc:49] “Registered plugin creator - ::DecodeBbox3DPlugin version 1”
I1028 07:34:25.906671 1 logging.cc:49] “Registered plugin creator - ::DetectionLayer_TRT version 1”
I1028 07:34:25.906684 1 logging.cc:49] “Registered plugin creator - ::EfficientNMS_Explicit_TF_TRT version 1”
I1028 07:34:25.906696 1 logging.cc:49] “Registered plugin creator - ::EfficientNMS_Implicit_TF_TRT version 1”
I1028 07:34:25.906710 1 logging.cc:49] “Registered plugin creator - ::EfficientNMS_ONNX_TRT version 1”
I1028 07:34:25.906723 1 logging.cc:49] “Registered plugin creator - ::EfficientNMS_TRT version 1”
I1028 07:34:25.906744 1 logging.cc:49] “Registered plugin creator - ::FlattenConcat_TRT version 1”
I1028 07:34:25.906762 1 logging.cc:49] “Registered plugin creator - ::GenerateDetection_TRT version 1”
I1028 07:34:25.906777 1 logging.cc:49] “Registered plugin creator - ::GridAnchor_TRT version 1”
I1028 07:34:25.906789 1 logging.cc:49] “Registered plugin creator - ::GridAnchorRect_TRT version 1”
I1028 07:34:25.906802 1 logging.cc:49] “Registered plugin creator - ::InstanceNormalization_TRT version 1”
I1028 07:34:25.906816 1 logging.cc:49] “Registered plugin creator - ::InstanceNormalization_TRT version 2”
I1028 07:34:25.906831 1 logging.cc:49] “Registered plugin creator - ::InstanceNormalization_TRT version 3”
I1028 07:34:25.906846 1 logging.cc:49] “Registered plugin creator - ::LReLU_TRT version 1”
I1028 07:34:25.906860 1 logging.cc:49] “Registered plugin creator - ::ModulatedDeformConv2d version 1”
I1028 07:34:25.906873 1 logging.cc:49] “Registered plugin creator - ::MultilevelCropAndResize_TRT version 1”
I1028 07:34:25.906888 1 logging.cc:49] “Registered plugin creator - ::MultilevelProposeROI_TRT version 1”
I1028 07:34:25.906900 1 logging.cc:49] “Registered plugin creator - ::MultiscaleDeformableAttnPlugin_TRT version 1”
I1028 07:34:25.906923 1 logging.cc:49] “Registered plugin creator - ::NMSDynamic_TRT version 1”
I1028 07:34:25.906936 1 logging.cc:49] “Registered plugin creator - ::NMS_TRT version 1”
I1028 07:34:25.906948 1 logging.cc:49] “Registered plugin creator - ::Normalize_TRT version 1”
I1028 07:34:25.906960 1 logging.cc:49] “Registered plugin creator - ::PillarScatterPlugin version 1”
I1028 07:34:25.906972 1 logging.cc:49] “Registered plugin creator - ::PriorBox_TRT version 1”
I1028 07:34:25.906984 1 logging.cc:49] “Registered plugin creator - ::ProposalDynamic version 1”
I1028 07:34:25.906995 1 logging.cc:49] “Registered plugin creator - ::ProposalLayer_TRT version 1”
I1028 07:34:25.907010 1 logging.cc:49] “Registered plugin creator - ::Proposal version 1”
I1028 07:34:25.907023 1 logging.cc:49] “Registered plugin creator - ::PyramidROIAlign_TRT version 1”
I1028 07:34:25.907034 1 logging.cc:49] “Registered plugin creator - ::Region_TRT version 1”
I1028 07:34:25.907044 1 logging.cc:49] “Registered plugin creator - ::Reorg_TRT version 2”
I1028 07:34:25.907054 1 logging.cc:49] “Registered plugin creator - ::Reorg_TRT version 1”
I1028 07:34:25.907065 1 logging.cc:49] “Registered plugin creator - ::ResizeNearest_TRT version 1”
I1028 07:34:25.907075 1 logging.cc:49] “Registered plugin creator - ::ROIAlign_TRT version 1”
I1028 07:34:25.907089 1 logging.cc:49] “Registered plugin creator - ::ROIAlign_TRT version 2”
I1028 07:34:25.907103 1 logging.cc:49] “Registered plugin creator - ::RPROI_TRT version 1”
I1028 07:34:25.907116 1 logging.cc:49] “Registered plugin creator - ::ScatterElements version 2”
I1028 07:34:25.907128 1 logging.cc:49] “Registered plugin creator - ::ScatterElements version 1”
I1028 07:34:25.907145 1 logging.cc:49] “Registered plugin creator - ::ScatterND version 1”
I1028 07:34:25.907156 1 logging.cc:49] “Registered plugin creator - ::SpecialSlice_TRT version 1”
I1028 07:34:25.907165 1 logging.cc:49] “Registered plugin creator - ::Split version 1”
I1028 07:34:25.907176 1 logging.cc:49] “Registered plugin creator - ::VoxelGeneratorPlugin version 1”
I1028 07:34:25.907398 1 tensorrt.cc:231] “TRITONBACKEND_ModelInitialize: ball_tracking_v2-internal (version 1)”
I1028 07:34:25.907501 1 shared_library.cc:112] “OpenLibraryHandle: /opt/tritonserver/backends/python/libtriton_python.so”
I1028 07:34:25.908526 1 model_config_utils.cc:1941] “ModelConfig 64-bit fields:”
I1028 07:34:25.908559 1 model_config_utils.cc:1943] “\tModelConfig::dynamic_batching::default_priority_level”
I1028 07:34:25.908566 1 model_config_utils.cc:1943] “\tModelConfig::dynamic_batching::default_queue_policy::default_timeout_microseconds”
I1028 07:34:25.908572 1 model_config_utils.cc:1943] “\tModelConfig::dynamic_batching::max_queue_delay_microseconds”
I1028 07:34:25.908578 1 model_config_utils.cc:1943] “\tModelConfig::dynamic_batching::priority_levels”
I1028 07:34:25.908583 1 model_config_utils.cc:1943] “\tModelConfig::dynamic_batching::priority_queue_policy::key”
I1028 07:34:25.908588 1 model_config_utils.cc:1943] “\tModelConfig::dynamic_batching::priority_queue_policy::value::default_timeout_microseconds”
I1028 07:34:25.908594 1 model_config_utils.cc:1943] “\tModelConfig::ensemble_scheduling::step::model_version”
I1028 07:34:25.908599 1 model_config_utils.cc:1943] “\tModelConfig::input::dims”
I1028 07:34:25.908604 1 model_config_utils.cc:1943] “\tModelConfig::input::reshape::shape”
I1028 07:34:25.908610 1 model_config_utils.cc:1943] “\tModelConfig::instance_group::secondary_devices::device_id”
I1028 07:34:25.908615 1 model_config_utils.cc:1943] “\tModelConfig::model_warmup::inputs::value::dims”
I1028 07:34:25.908620 1 model_config_utils.cc:1943] “\tModelConfig::optimization::cuda::graph_spec::graph_lower_bound::input::value::dim”
I1028 07:34:25.908626 1 model_config_utils.cc:1943] “\tModelConfig::optimization::cuda::graph_spec::input::value::dim”
I1028 07:34:25.908631 1 model_config_utils.cc:1943] “\tModelConfig::output::dims”
I1028 07:34:25.908636 1 model_config_utils.cc:1943] “\tModelConfig::output::reshape::shape”
I1028 07:34:25.908641 1 model_config_utils.cc:1943] “\tModelConfig::sequence_batching::direct::max_queue_delay_microseconds”
I1028 07:34:25.908647 1 model_config_utils.cc:1943] “\tModelConfig::sequence_batching::max_sequence_idle_microseconds”
I1028 07:34:25.908652 1 model_config_utils.cc:1943] “\tModelConfig::sequence_batching::oldest::max_queue_delay_microseconds”
I1028 07:34:25.908657 1 model_config_utils.cc:1943] “\tModelConfig::sequence_batching::state::dims”
I1028 07:34:25.908663 1 model_config_utils.cc:1943] “\tModelConfig::sequence_batching::state::initial_state::dims”
I1028 07:34:25.908668 1 model_config_utils.cc:1943] “\tModelConfig::version_policy::specific::versions”
I1028 07:34:25.908891 1 model_state.cc:317] “Setting the CUDA device to GPU0 to auto-complete config for ball_tracking_v2-internal”
I1028 07:34:25.908988 1 model_state.cc:363] “Using explicit serialized file ‘model.plan’ to auto-complete config for ball_tracking_v2-internal”
I1028 07:34:25.912464 1 python_be.cc:1618] “‘python’ TRITONBACKEND API version: 1.19”
I1028 07:34:25.912516 1 python_be.cc:1640] “backend configuration:\n{"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"5.300000","default-max-batch-size":"4"}}”
I1028 07:34:25.912781 1 python_be.cc:1778] “Shared memory configuration is shm-default-byte-size=1048576,shm-growth-byte-size=1048576,stub-timeout-seconds=30”
I1028 07:34:25.913663 1 python_be.cc:2075] “TRITONBACKEND_GetBackendAttribute: setting attributes”
I1028 07:34:25.913910 1 python_be.cc:1879] “TRITONBACKEND_ModelInitialize: ball_tracking_v2 (version 1)”
I1028 07:34:25.916027 1 stub_launcher.cc:385] “Starting Python backend stub: exec /opt/tritonserver/backends/python/triton_python_backend_stub /models/ball_tracking_v2/1/model.py triton_python_backend_shm_region_ae800121-f883-4f3f-9123-55a4aa4cf3ee 1048576 1048576 1 /opt/tritonserver/backends/python 336 ball_tracking_v2 DEFAULT”
I1028 07:34:25.952927 1 logging.cc:46] “Loaded engine size: 33 MiB”
I1028 07:34:26.002538 1 logging.cc:49] “Local registry did not find NMSDynamic_TRT creator. Will try parent registry if enabled.”
I1028 07:34:26.002605 1 logging.cc:49] “Global registry found NMSDynamic_TRT creator.”

There are some differences between JetPack 5 and JetPack 6. For example, the Ubuntu version moves from 20.04 to 22.04, and the TensorRT version changes as well.
You can run $ dpkg -l | grep cuda on the Orin Nano to confirm.
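
Similarly, the TensorRT and JetPack versions can be checked (exact package names can vary between JetPack releases):

$ dpkg -l | grep -i tensorrt   # TensorRT packages and versions
$ cat /etc/nv_tegra_release    # JetPack / L4T release string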

So the model.plan (i.e., the TensorRT engine) needs to be regenerated under the new TensorRT version.
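
A minimal sketch of regenerating it on the device with trtexec (file names taken from the logs above; the trtexec path is the usual JetPack location and may differ on your setup):

$ /usr/src/tensorrt/bin/trtexec \
    --onnx=retinanet_balltracking_us_data.onnx \
    --saveEngine=model.plan \
    --fp16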

Okay, so here is some more background.

  1. First I train the RetinaNet model with TAO Toolkit
  2. I export the ONNX model using tao model retinanet export -m /results/run1/weights/retinanet_resnet18_epoch_100.hdf5 -e /specs/retinanet_train_resnet18.txt
  3. I generate the model.plan from my ONNX model on the host machine
  4. I load the model.plan into the Triton server (repository layout sketched below)
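
For reference, the Triton model repository looks roughly like this (directory and file names taken from the logs above; config.pbtxt is the standard Triton per-model configuration file):

models/
  ball_tracking_v2/              # Python frontend model
    1/
      model.py
  ball_tracking_v2-internal/     # TensorRT model from this thread
    config.pbtxt
    1/
      retinanet_balltracking_us_data.onnx
      model.plan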

These are the exact steps I followed with JetPack 5, and everything worked perfectly. However, after updating to JetPack 6, I can no longer load the model into the Triton server (see logs above). The model.plan file is now also generated with the updated TensorRT version.

Could there be significant differences between the TensorRT versions shipped with JetPack 5 and JetPack 6 that might be causing this issue?

If your host machine has the same TensorRT version as the Triton server, there should be no issue.
I suggest generating model.plan where you are going to run inference; in this case, that is the Triton server. You can log in to the Triton server and generate the TensorRT engine there.

I’m generating model.plan on the same host machine that runs the Triton server, so unfortunately this is not it.

I suggest you docker run into the TensorRT 10.04 container and generate the TensorRT engine inside it.
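
Something like the following (the image name is a placeholder for whichever TensorRT 10.04-based image you run, and the trtexec path is an assumption; adjust as needed):

$ docker run --rm -it --runtime nvidia -v /path/to/models:/models <tensorrt-10.04-image> \
    /usr/src/tensorrt/bin/trtexec \
      --onnx=/models/ball_tracking_v2-internal/1/retinanet_balltracking_us_data.onnx \
      --saveEngine=/models/ball_tracking_v2-internal/1/model.plan \
      --fp16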

Yes, I understand, and I’m already doing this. I tried automating it with repository agents and also tried it manually. The conversion does produce model.plan, but Triton cannot load it (I get the logs I posted in the topic, and that is all the information I have). I also want to add that this procedure was working on the older JetPack version with TensorRT 8.6. Additionally, I’ve just tested the same process with EfficientDet (a TensorFlow 2 model), and it works. Does TensorRT 10.04 support TensorFlow 1 models (such as RetinaNet)?
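
As a sanity check outside Triton, deserializing the engine directly should show whether the crash comes from the engine itself or from the server (trtexec path assumed from the JetPack TensorRT packages):

$ /usr/src/tensorrt/bin/trtexec --loadEngine=model.plan

If this also fails, the problem is in the engine or plugin layer rather than in Triton.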

For TAO TF1 models, the latest verified TensorRT version is 8.6.3; you can find this info in the 5.5 TAO Deploy docker. For TensorRT 10.04 the status is unknown, so I am not sure whether there is an issue.
Thus, you may check whether it is possible to run a Triton server that is based on TensorRT 8.6.

Okay, so looking at our discussion and the problems I have, I assume that old TensorFlow 1 models are not supported in the Triton server with the new TensorRT 10.04, because I cannot see any other difference from the process perspective.

The TAO Deploy docker will support TensorRT 10 in a future release, so the old models should then work with TensorRT 10.

Super, thank you for clarifying that. I will be waiting for the updates on this matter.

BTW, the official Triton server for TAO is shared at GitHub - NVIDIA-AI-IOT/tao-toolkit-triton-apps: Sample app code for deploying TAO Toolkit trained models to Triton. It can run inference against RetinaNet engines.
Currently it is based on nvcr.io/nvidia/tritonserver:23.02-py3, whose TensorRT version is 8.x. You can find the Dockerfile at tao-toolkit-triton-apps/docker/Dockerfile at main · NVIDIA-AI-IOT/tao-toolkit-triton-apps · GitHub.
If you have bandwidth, you can update the base docker to TensorRT 10 and also update the other packages for debugging.
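
A minimal sketch of that change in the Dockerfile (the target tag is an assumption; pick a release whose TensorRT version matches your JetPack 6 device):

# before: TensorRT 8.x base image
FROM nvcr.io/nvidia/tritonserver:23.02-py3
# after: a TensorRT 10-based release, e.g.
FROM nvcr.io/nvidia/tritonserver:<24.xx>-py3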

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.