Triton Inference Server Inference Request Error on GPU

Hello,

I am using a Jetson Nano 4GB with JetPack 4.6.

I need to deploy my DeepLab semantic segmentation model to Triton Inference Server and then send an inference request to the deployed model.

Below you can see my Triton Inference Server configuration file for the semantic segmentation model and information about its deployment process.

name: "segmentation"
platform: "onnxruntime_onnx"
max_batch_size: 0
input [
  {
    name: "ImageTensor:0"
    data_type: TYPE_UINT8
    dims: [ 1, 1000, 1000, 3 ]
  }
]
output [
  {
    name: "SemanticPredictions:0"
    data_type: TYPE_INT32
    dims: [ 1, -1, -1 ]
  }
]
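This config.pbtxt sits in a standard Triton model repository, laid out roughly as follows (the paths are just examples from my setup):

model_repository/
  segmentation/
    config.pbtxt
    1/
      model.onnx

and the server is started with something like:

tritonserver --model-repository=/path/to/model_repository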


When I try to send an inference request (using an HTTP client script) to the deployed segmentation model, I get some errors and then my inference script stops. Below I am sharing the errors and the Nano's resource usage during this process.
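For context, the HTTP client script is essentially the following (a simplified sketch assuming the tritonclient Python package; a random uint8 image stands in for my real preprocessing, and the server address is just an example):

import numpy as np
import tritonclient.http as httpclient

# Connect to the Triton HTTP endpoint (address/port are examples)
client = httpclient.InferenceServerClient(url="localhost:8000")

# Build the input tensor matching the config: [1, 1000, 1000, 3], UINT8
image = np.random.randint(0, 255, size=(1, 1000, 1000, 3), dtype=np.uint8)
infer_input = httpclient.InferInput("ImageTensor:0", list(image.shape), "UINT8")
infer_input.set_data_from_numpy(image, binary_data=True)

# Request the segmentation output declared in the config
output = httpclient.InferRequestedOutput("SemanticPredictions:0", binary_data=True)

# Send the inference request and read the result back as a numpy array
result = client.infer("segmentation", inputs=[infer_input], outputs=[output])
predictions = result.as_numpy("SemanticPredictions:0")
print(predictions.shape)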


If I change the model's config file as below and deploy it on the CPU, the same inference request code works without errors.

name: "segmentation"
platform: "onnxruntime_onnx"
max_batch_size: 0
input [
  {
    name: "ImageTensor:0"
    data_type: TYPE_UINT8
    dims: [ 1, 1000, 1000, 3 ]
  }
]
output [
  {
    name: "SemanticPredictions:0"
    data_type: TYPE_INT32
    dims: [ 1, -1, -1 ]
  }
]

instance_group [
  {
    count: 1
    kind: KIND_CPU
  }
]
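For reference, my understanding is that the first config (with no instance_group) makes Triton place the model on the GPU by default, which should be equivalent to adding something like:

instance_group [
  {
    count: 1
    kind: KIND_GPU
    gpus: [ 0 ]
  }
]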

What is the reason for this problem?

Why can I run the inference code on the CPU but not on the GPU?

What do you suggest to solve this problem?

Is this problem due to the Jetson Nano's hardware limitations? For example, if I use a Xavier, will I encounter the same problem?

Thanks

Please re-post your question on Triton Inference Server · GitHub; the NVIDIA and other teams will be able to help you there.
Sorry for the inconvenience, and thanks for your patience.