Triton Inference Server Inference Request Error on GPU

Hello,

I am using a Jetson Nano 4GB with JetPack 4.6.

I need to deploy my DeepLab semantic segmentation model to Triton Inference Server and then send an inference request to the deployed model.

Below is my Triton Inference Server configuration file for the semantic segmentation model, along with information about its deployment process.

name: "segmentation"
platform: "onnxruntime_onnx"
max_batch_size: 0
input [
  {
    name: "ImageTensor:0"
    data_type: TYPE_UINT8
    dims: [ 1, 1000, 1000, 3 ]
  }
]
output [
  {
    name: "SemanticPredictions:0"
    data_type: TYPE_INT32
    dims: [ 1, -1, -1 ]
  }
]

When I try to send an inference request (using an HTTP client script) to the deployed segmentation model, I get some errors and then the inference request script stops. Below I am sharing the errors and the resource usage of the Nano during this process.
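
The request script is essentially the standard tritonclient HTTP flow. A minimal sketch of it (the localhost:8000 URL, the image path, and the preprocessing are simplified assumptions, not my exact script):

import numpy as np
import tritonclient.http as httpclient
from PIL import Image

# Connect to Triton's HTTP endpoint (default port 8000 assumed)
client = httpclient.InferenceServerClient(url="localhost:8000")

# Load an image and resize it to the model's fixed 1000x1000 input
img = np.array(Image.open("test.jpg").resize((1000, 1000)), dtype=np.uint8)
img = np.expand_dims(img, axis=0)  # shape: [1, 1000, 1000, 3]

# Build the request against the tensor names from config.pbtxt
inputs = [httpclient.InferInput("ImageTensor:0", list(img.shape), "UINT8")]
inputs[0].set_data_from_numpy(img)
outputs = [httpclient.InferRequestedOutput("SemanticPredictions:0")]

result = client.infer(model_name="segmentation", inputs=inputs, outputs=outputs)
mask = result.as_numpy("SemanticPredictions:0")
print(mask.shape, mask.dtype)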

If I change the model's config file (below) and deploy it on the CPU, the same inference request code works without errors.

name: "segmentation"
platform: "onnxruntime_onnx"
max_batch_size: 0
input [
  {
    name: "ImageTensor:0"
    data_type: TYPE_UINT8
    dims: [ 1, 1000, 1000, 3 ]
  }
]
output [
  {
    name: "SemanticPredictions:0"
    data_type: TYPE_INT32
    dims: [ 1, -1, -1 ]
  }
]

instance_group [
  {
    count: 1
    kind: KIND_CPU
  }
]
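
For the GPU deployment the config is the same except for this section: either instance_group is omitted (Triton then places the model on the GPU by default) or it is set explicitly, roughly like this (a sketch assuming GPU device 0):

instance_group [
  {
    count: 1
    kind: KIND_GPU
    gpus: [ 0 ]
  }
]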

What is the reason for this problem?

Why can't I run the inference code on the GPU when it runs without errors on the CPU?

What do you suggest to solve this problem?

Is this problem due to the hardware limitations of the Jetson Nano? For example, if I use a Xavier, will I encounter the same problem?

Thanks

There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.
Thanks

Hi,

It seems that you are using an INT32 model.

Nano doesn't support INT8/INT32 CUDA operations.
Could you try a float type model to see if the same issue occurs?
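
For reference, you can check which element types the exported ONNX model actually uses with the onnx Python API (a minimal sketch; the model path is a placeholder):

import onnx

# Print the element type of every model input and output tensor
model = onnx.load("model.onnx")  # placeholder path
for t in list(model.graph.input) + list(model.graph.output):
    elem_type = t.type.tensor_type.elem_type
    print(t.name, onnx.TensorProto.DataType.Name(elem_type))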

Thanks.