Triton Inference Server's health status shows 'Connection reset by peer'

Hi,

Description

Facing an error while connecting to the Triton Inference Server (it looks like the server is failing during startup).

Environment

• Hardware Platform (GPU): NVIDIA RTX 2080 Ti
• DeepStream Version: 5.0
• NVIDIA GPU Driver Version: 450.102.04
• Issue Type: question

Steps To Reproduce

I am trying to use Triton Inference Server for image classification. The Docker images that I have tried are:

  1. nvcr.io/nvidia/deepstream:5.0-dp-20.04-triton
  2. nvcr.io/nvidia/deepstream:5.0.1-20.09-triton

I have run the server using the following command:
sudo docker run --gpus all -it -d -p8000:8000 -p8001:8001 -p8002:8002 -v:/models tritonserver --model-repository=/models
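For reference, the documented form of this command for the standalone Triton image is roughly as follows (the host model-repository path is a placeholder, and the image name goes right before the tritonserver command; with the DeepStream images, the image name sits in the same position):

docker run --gpus all -d -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  -v /path/to/model_repository:/models \
  nvcr.io/nvidia/tritonserver:20.03.1-py3 \
  tritonserver --model-repository=/models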

Now, when I try to check the health status of the running Triton server using:
curl -v localhost:8000/v2/health/ready

It connects for a moment and then gives a connection-reset error:

*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8000 (#0)
> GET /v2/health/ready HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.58.0
> Accept: */*
>
* Recv failure: Connection reset by peer
* stopped the pause stream!
* Closing connection 0
curl: (56) Recv failure: Connection reset by peer
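A reset like this on a published port usually means the tritonserver process inside the container exited during startup (for example, because a model in the repository failed to load), so Docker accepts the TCP connection but has nothing to forward it to. The container logs should show the actual startup error; a quick way to check (the container ID is a placeholder):

docker ps -a                 # see whether the container is still running or has exited
docker logs <container-id>   # look for model-load or CUDA errors printed at startup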

I have also tried to run classification on an image against the Triton Inference Server, using the triton-client setup outside the Triton container.

python3 python/image_client.py -m inception_graphdef -s INCEPTION qa/images/mug.jpg

But it gives the following traceback:
Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.6/site-packages/geventhttpclient/response.py", line 190, in _read_headers
    data = self._sock.recv(self.block_size)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/gevent/_socketcommon.py", line 657, in recv
    return self._sock.recv(*args)
ConnectionResetError: [Errno 104] Connection reset by peer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./python/image_client.py", line 401, in <module>
    model_name=FLAGS.model_name, model_version=FLAGS.model_version)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tritonclient/http/__init__.py", line 494, in get_model_metadata
    query_params=query_params)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tritonclient/http/__init__.py", line 258, in _get
    response = self._client_stub.get(request_uri)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/geventhttpclient/client.py", line 266, in get
    return self.request(METHOD_GET, request_uri, headers=headers)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/geventhttpclient/client.py", line 260, in request
    raise e
  File "/home/ubuntu/.local/lib/python3.6/site-packages/geventhttpclient/client.py", line 254, in request
    block_size=self.block_size, method=method.upper(), headers_type=self.headers_type)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/geventhttpclient/response.py", line 298, in __init__
    super(HTTPSocketPoolResponse, self).__init__(sock, **kw)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/geventhttpclient/response.py", line 170, in __init__
    self._read_headers()
  File "/home/ubuntu/.local/lib/python3.6/site-packages/geventhttpclient/response.py", line 204, in _read_headers
    'connection closed.')
geventhttpclient.response.HTTPConnectionClosed: connection closed.

I have tried both the latest and an older version of triton-inference-server; both give the same error.

Looking for a hint as to what is going wrong with the setup. Let me know if any other information is needed.

Hi, we request you to share your model and script, so that we can help you better.

Alternatively, you can try running your model with the trtexec command.
https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/trtexec
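For example, to build an engine from an ONNX model and then benchmark it (paths are placeholders):

trtexec --onnx=/path/to/model.onnx --saveEngine=/path/to/model.plan
trtexec --loadEngine=/path/to/model.plan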

Thanks!

Hi,
To start with, I am using the sample models provided inside the triton-inference-server Docker container at:
/opt/nvidia/deepstream/deepstream-5.0/samples/models

which include the following models:
Primary_Detector/
Secondary_CarColor/
Secondary_CarMake/
Secondary_VehicleTypes/
Segmentation_Industrial/
Segmentation_Semantic/
densenet_onnx/
inception_graphdef/
mobilenet_v1/
ssd_inception_v2_coco_2018_01_28/
ssd_mobilenet_v1_coco_2018_01_28/

I have arranged the models in the same hierarchy as defined in the docs.
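For reference, that hierarchy (per the Triton model-repository docs) looks like the following, shown here for one of the sample models; the labels file name varies per model:

models/
  inception_graphdef/
    config.pbtxt
    inception_labels.txt
    1/
      model.graphdef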

Hi,

As the DeepStream + Triton server Docker image was not running properly, we tried running the standalone Triton Inference Server Docker image, without DeepStream 5.0, using the following command:
docker pull nvcr.io/nvidia/tritonserver:20.03.1-py3

This time we used the triton-client SDK Docker image to send inference requests. We used the following client image:
docker pull nvcr.io/nvidia/tritonserver:20.03.1-py3-sdk
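For reference, the client container can be run on the host network so that localhost:8000/8001 from inside it reach the server (a sketch; the tag is assumed to mirror the server version):

docker run -it --rm --net=host nvcr.io/nvidia/tritonserver:20.03.1-py3-sdk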

With this, the sample models loaded successfully and we were able to run inference on them. Our own two-class classification model (a TensorRT engine with ResNet18 as the meta-architecture) also loaded successfully.

However, when I ran classification with it, it returned a 0.5 score for both classes on most images, and on the images where it did give a higher score, the result was wrong. On changing the batch size:
Request 13, batch size 5
Image '/workspace/images/testing_img//9ch2_2.jpg':
    1 (OTHERS) = 0.995663
    0 (CLEANING) = 0.0043374
Image '/workspace/images/testing_img//9ch3_2.jpg':
    0 (CLEANING) = 0.5
    1 (OTHERS) = 0.5
Image '/workspace/images/testing_img//10ch2_2.jpg':
    0 (CLEANING) = 0.5
    1 (OTHERS) = 0.5
Image '/workspace/images/testing_img//10ch3_2.jpg':
    0 (CLEANING) = 0.5
    1 (OTHERS) = 0.5
Image '/workspace/images/testing_img//11ch2_2.jpg':
    0 (CLEANING) = 0.5
    1 (OTHERS) = 0.5

The model config is attached: config.pbtxt (442 bytes).
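For context, a minimal config for a two-class TensorRT classifier typically looks like the sketch below; the model name, tensor names, and dims here are hypothetical and must match the serialized engine:

name: "cleaning_classifier"
platform: "tensorrt_plan"
max_batch_size: 8
input [
  {
    name: "input_1"
    data_type: TYPE_FP32
    format: FORMAT_NCHW
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "predictions"
    data_type: TYPE_FP32
    dims: [ 2 ]
    label_filename: "labels.txt"
  }
]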

Please let me know if there are any gaps in the config file or any other settings, as the same model works correctly when tested using DeepStream.

Thanks

Hi @sheetal.vishwakarma,

This doesn't look like a TensorRT issue. Please post your query in the related forum.

Thank you.

Moved

Gentle Reminder!