This issue was already posted on the GitHub TRTIS repository (here), but it has not been resolved.
Description
Triton Inference Server 21.03 does not update the models after a new model is added to the model_repository.
I used the old Triton version (20.03) for a long time and everything worked well. However, old versions of TRTIS do not support Ampere-architecture GPUs, so I switched to a newer version of TRTIS (21.03).
I could successfully launch TRTIS, and the startup log showed the model status as “READY”:
I0415 02:43:44.019434 1 server.cc:570]
+-------------------------------+---------+--------+
| Model | Version | Status |
+-------------------------------+---------+--------+
| densenet_onnx | 1 | READY |
+-------------------------------+---------+--------+
I also checked the readiness endpoint:
$ curl -v localhost:8000/v2/health/ready
* Trying 127.0.0.1:8000...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8000 (#0)
> GET /v2/health/ready HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.68.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Length: 0
< Content-Type: text/plain
<
* Connection #0 to host localhost left intact
The weird thing is that the following command returned nothing:
$ curl localhost:8000/api/status
Normally it should print the model information, shouldn’t it? However, I cannot see any information.
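My assumption is that the old v1 /api/status route was removed in the 2.x releases, and model information moved to the KServe-style v2 HTTP routes that Triton 2.x implements. If that is right, the equivalent checks for the densenet_onnx model above would be something like:

```shell
# Sketch, assuming the v2 HTTP/REST API replaced the v1 /api/status route.
curl localhost:8000/v2/models/densenet_onnx          # model metadata
curl localhost:8000/v2/models/densenet_onnx/config   # full model configuration
curl localhost:8000/v2/models/densenet_onnx/ready    # per-model readiness
```

These commands require the server from the reproduction step below to be running.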
In addition, to test whether Triton updates models immediately, I moved the model folder out of the model_repository (and then moved it back in), but nothing happened. Normally, the server should log something like “load” or “unload” for the model in the terminal.
Although I cannot check the model information via localhost:8000/api/status, I can run the example client and get correct results, which proves that the model loaded correctly at startup and that the server works.
Example output:
$ /workspace/install/bin/image_client -m densenet_onnx -c 3 -s INCEPTION /workspace/images/mug.jpg
Request 0, batch size 1
Image '/workspace/images/mug.jpg':
15.349566 (504) = COFFEE MUG
13.227467 (968) = CUP
10.424896 (505) = COFFEEPOT
Triton Information
I am using Triton version 21.03, pulled from NGC.
To Reproduce
docker run --rm --gpus all \
--shm-size=1g \
--ulimit memlock=-1 \
--ulimit stack=67108864 \
-p 8000:8000 -p 8001:8001 -p 8002:8002 \
--name trt_serving2103_server \
-v /home/model_repository:/models \
nvcr.io/nvidia/tritonserver:21.03-py3 \
tritonserver --model-repository=/models
The model is densenet_onnx, downloaded from the example.
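A possible cause (this is my assumption, not confirmed): in the 2.x releases the default --model-control-mode is none, so changes in the model repository are ignored, whereas older TRTIS releases polled the repository by default. Restarting the container with polling enabled might restore the old behavior; the command below is the same docker run as above with two extra tritonserver flags (the 30-second interval is an arbitrary example):

```shell
# Sketch: same reproduction command, with repository polling enabled.
docker run --rm --gpus all \
  --shm-size=1g \
  --ulimit memlock=-1 \
  --ulimit stack=67108864 \
  -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  --name trt_serving2103_server \
  -v /home/model_repository:/models \
  nvcr.io/nvidia/tritonserver:21.03-py3 \
  tritonserver --model-repository=/models \
               --model-control-mode=poll \
               --repository-poll-secs=30
```

With polling on, adding or removing a model directory under /models should produce the expected “load”/“unload” messages in the server log.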
Expected behavior
- I can check the model information via
localhost:8000/api/status
- The server can load, update, or unload models without being restarted. (This is a very important feature of TRTIS.)
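If polling is not desired, my understanding is that the v2 model-repository extension allows explicit load/unload over HTTP when the server is started with --model-control-mode=explicit; a sketch of those calls (assuming the extension is available in 21.03):

```shell
# Requires tritonserver started with --model-control-mode=explicit.
curl -X POST localhost:8000/v2/repository/index                        # list repository contents
curl -X POST localhost:8000/v2/repository/models/densenet_onnx/load    # load the model
curl -X POST localhost:8000/v2/repository/models/densenet_onnx/unload  # unload the model
```

This would also cover the second expected behavior above: models can be swapped without closing the server.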