Triton Inference Server's health status shows 'Connection reset by peer'

Hi,

Description

Facing an error while connecting to the Triton Inference Server (it looks like the server is failing during startup).

Environment

• Hardware Platform (GPU): NVIDIA RTX 2080 Ti
• DeepStream Version: 5.0
• NVIDIA GPU Driver Version: 450.102.04
• Issue Type: question

Steps To Reproduce

I am trying to use Triton Inference Server for image classification. The Docker images that I have tried are:

  1. nvcr.io/nvidia/deepstream:5.0-dp-20.04-triton
  2. nvcr.io/nvidia/deepstream:5.0.1-20.09-triton

I have run the server using the following command:
sudo docker run --gpus all -it -d -p8000:8000 -p8001:8001 -p8002:8002 -v:/models tritonserver --model-repository=/models
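For reference, the documented form of this command for the standalone Triton image is roughly as follows (the host model-repository path is a placeholder, and the image name goes right before the tritonserver command; with the DeepStream images, the image name sits in the same position):

docker run --gpus all -d -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  -v /path/to/model_repository:/models \
  nvcr.io/nvidia/tritonserver:20.03.1-py3 \
  tritonserver --model-repository=/models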

Now, when I try to check the health status of the running Triton server using:
curl -v localhost:8000/v2/health/ready

It connects for a moment and then gives a connection-reset error:

*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8000 (#0)
> GET /v2/health/ready HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.58.0
> Accept: */*
>
* Recv failure: Connection reset by peer
* stopped the pause stream!
* Closing connection 0
curl: (56) Recv failure: Connection reset by peer
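A reset like this on a published port usually means the tritonserver process inside the container exited during startup (for example, because a model in the repository failed to load), so Docker accepts the TCP connection but has nothing to forward it to. The container logs should show the actual startup error; a quick way to check (the container ID is a placeholder):

docker ps -a                 # see whether the container is still running or has exited
docker logs <container-id>   # look for model-load or CUDA errors printed at startup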

I have also tried to run classification on an image against the Triton Inference Server, using the triton-client setup outside the Triton container.

python3 python/image_client.py -m inception_graphdef -s INCEPTION qa/images/mug.jpg

But it gives the following traceback:
Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.6/site-packages/geventhttpclient/response.py", line 190, in _read_headers
    data = self._sock.recv(self.block_size)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/gevent/_socketcommon.py", line 657, in recv
    return self._sock.recv(*args)
ConnectionResetError: [Errno 104] Connection reset by peer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./python/image_client.py", line 401, in <module>
    model_name=FLAGS.model_name, model_version=FLAGS.model_version)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tritonclient/http/__init__.py", line 494, in get_model_metadata
    query_params=query_params)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/tritonclient/http/__init__.py", line 258, in _get
    response = self._client_stub.get(request_uri)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/geventhttpclient/client.py", line 266, in get
    return self.request(METHOD_GET, request_uri, headers=headers)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/geventhttpclient/client.py", line 260, in request
    raise e
  File "/home/ubuntu/.local/lib/python3.6/site-packages/geventhttpclient/client.py", line 254, in request
    block_size=self.block_size, method=method.upper(), headers_type=self.headers_type)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/geventhttpclient/response.py", line 298, in __init__
    super(HTTPSocketPoolResponse, self).__init__(sock, **kw)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/geventhttpclient/response.py", line 170, in __init__
    self._read_headers()
  File "/home/ubuntu/.local/lib/python3.6/site-packages/geventhttpclient/response.py", line 204, in _read_headers
    'connection closed.')
geventhttpclient.response.HTTPConnectionClosed: connection closed.

I have tried both the latest and an older version of triton-inference-server; both give the same error.

Looking for a hint as to what is going wrong with the setup. Let me know if any other information is needed.

Hi, we request you to share your model and script, so that we can help you better.

Alternatively, you can try running your model with the trtexec command.
https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/trtexec
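For example, to build an engine from an ONNX model and then benchmark it (paths are placeholders):

trtexec --onnx=/path/to/model.onnx --saveEngine=/path/to/model.plan
trtexec --loadEngine=/path/to/model.plan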

Thanks!

Hi,
To start with, I am using the sample models provided inside the triton-inference-server Docker container at:
/opt/nvidia/deepstream/deepstream-5.0/samples/models

which include the following models:
Primary_Detector/
Secondary_CarColor/
Secondary_CarMake/
Secondary_VehicleTypes/
Segmentation_Industrial/
Segmentation_Semantic/
densenet_onnx/
inception_graphdef/
mobilenet_v1/
ssd_inception_v2_coco_2018_01_28/
ssd_mobilenet_v1_coco_2018_01_28/

I have arranged the models in the same hierarchy as defined in the docs.
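For reference, that hierarchy (per the Triton model-repository docs) looks like the following, shown here for one of the sample models; the labels file name varies per model:

models/
  inception_graphdef/
    config.pbtxt
    inception_labels.txt
    1/
      model.graphdef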

Hi,

As the DeepStream + Triton server Docker image was not running properly, we tried running the standalone Triton Inference Server Docker image, without DeepStream 5.0, using the following command:
docker pull nvcr.io/nvidia/tritonserver:20.03.1-py3

This time we used the triton-client SDK Docker image to send inference requests. We used the following client image:
docker pull nvcr.io/nvidia/tritonserver:20.03.1-py3-sdk
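For reference, the client container can be run on the host network so that localhost:8000/8001 from inside it reach the server (a sketch; the tag is assumed to mirror the server version):

docker run -it --rm --net=host nvcr.io/nvidia/tritonserver:20.03.1-py3-sdk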

With this, the sample models loaded successfully and we were able to run inference on them. Our own two-class classification model (a TensorRT engine with ResNet18 as the meta-architecture) also loaded successfully.

However, when I ran classification with it, it returned a 0.5 score for both classes on most images, and on the images where it did give a higher score, the result was wrong. On changing the batch size:
Request 13, batch size 5
Image '/workspace/images/testing_img//9ch2_2.jpg':
    1 (OTHERS) = 0.995663
    0 (CLEANING) = 0.0043374
Image '/workspace/images/testing_img//9ch3_2.jpg':
    0 (CLEANING) = 0.5
    1 (OTHERS) = 0.5
Image '/workspace/images/testing_img//10ch2_2.jpg':
    0 (CLEANING) = 0.5
    1 (OTHERS) = 0.5
Image '/workspace/images/testing_img//10ch3_2.jpg':
    0 (CLEANING) = 0.5
    1 (OTHERS) = 0.5
Image '/workspace/images/testing_img//11ch2_2.jpg':
    0 (CLEANING) = 0.5
    1 (OTHERS) = 0.5

The model config is attached: config.pbtxt (442 bytes).
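For context, a minimal config for a two-class TensorRT classifier typically looks like the sketch below; the model name, tensor names, and dims here are hypothetical and must match the serialized engine:

name: "cleaning_classifier"
platform: "tensorrt_plan"
max_batch_size: 8
input [
  {
    name: "input_1"
    data_type: TYPE_FP32
    format: FORMAT_NCHW
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "predictions"
    data_type: TYPE_FP32
    dims: [ 2 ]
    label_filename: "labels.txt"
  }
]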

Please let me know if there are any gaps in the config file or any other settings, as the same model works correctly when tested using DeepStream.

Thanks

Hi @sheetal.vishwakarma,

This doesn't look like a TensorRT issue. Please post your query in the related forum.

Thank you.

Moved

Gentle Reminder!