Failed to deploy the inference server: make an inference request to the peopleNet model via HTTP

Description

The Triton Inference Server is running and reports the peopleNet model as READY, but I don't know how to make an inference request to it over HTTP with the Triton client.

Environment

TensorRT Version:
GPU Type: NVIDIA GeForce GTX 1650
Nvidia Driver Version: 535.183.06
CUDA Version: 12.2
CUDNN Version:
Operating System + Version: Ubuntu 22.04
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Steps To Reproduce

I deployed a Triton server and built and loaded the peopleNet model.
Now I want to run inference on peopleNet over HTTP using the Triton client, but I can't get it to work.
What should I do? I need an example.

Here is the Triton server startup log:

I0828 08:14:49.458910 2675 server.cc:631]
+----------+-----------------------------------------------------------+---------------------------------------------------+
| Backend  | Path                                                      | Config                                            |
+----------+-----------------------------------------------------------+---------------------------------------------------+
| pytorch  | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so  | {}                                                |
| tensorrt | /opt/tritonserver/backends/tensorrt/libtriton_tensorrt.so | {"cmdline":{"auto-complete-config":"true",        |
|          |                                                           | "backend-directory":"/opt/tritonserver/backends", |
|          |                                                           | "min-compute-capability":"6.000000",              |
|          |                                                           | "default-max-batch-size":"4"}}                    |
+----------+-----------------------------------------------------------+---------------------------------------------------+

I0828 08:14:49.458935 2675 server.cc:674]
+-----------+---------+--------+
| Model     | Version | Status |
+-----------+---------+--------+
| peopleNet | 1       | READY  |
+-----------+---------+--------+

I0828 08:14:49.487243 2675 metrics.cc:810] Collecting metrics for GPU 0: NVIDIA GeForce GTX 1650
I0828 08:14:49.487381 2675 metrics.cc:703] Collecting CPU metrics
I0828 08:14:49.487535 2675 tritonserver.cc:2435]
+----------------------------------+----------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                        |
+----------------------------------+----------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                       |
| server_version                   | 2.37.0                                                                                       |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy |
|                                  | model_configuration system_shared_memory cuda_shared_memory binary_tensor_data parameters    |
|                                  | statistics trace logging                                                                     |
| model_repository_path[0]         | /opt/nvidia/deepstream/deepstream-6.4/samples/configs/tao_pretrained_models/triton           |
| model_control_mode               | MODE_NONE                                                                                    |
| strict_model_config              | 0                                                                                            |
| rate_limit                       | OFF                                                                                          |
| pinned_memory_pool_byte_size     | 268435456                                                                                    |
| cuda_memory_pool_byte_size{0}    | 67108864                                                                                     |
| min_supported_compute_capability | 6.0                                                                                          |
| strict_readiness                 | 1                                                                                            |
| exit_timeout                     | 30                                                                                           |
| cache_enabled                    | 0                                                                                            |
+----------------------------------+----------------------------------------------------------------------------------------------+

I0828 08:14:49.488479 2675 grpc_server.cc:2451] Started GRPCInferenceService at 0.0.0.0:8001
I0828 08:14:49.488627 2675 http_server.cc:3558] Started HTTPService at 0.0.0.0:8000
I0828 08:14:49.529903 2675 http_server.cc:187] Started Metrics Service at 0.0.0.0:8002

===========================================================================

arstest@arstest:/media/arstest/F567-A30F/triton/opencv$ curl -v http://192.168.50.20:8000/v2
*   Trying 192.168.50.20:8000...
* Connected to 192.168.50.20 (192.168.50.20) port 8000 (#0)
> GET /v2 HTTP/1.1
> Host: 192.168.50.20:8000
> User-Agent: curl/7.81.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Type: application/json
< Content-Length: 285
<
* Connection #0 to host 192.168.50.20 left intact
{"name":"triton","version":"2.37.0","extensions":["classification","sequence","model_repository","model_repository(unload_dependents)","schedule_policy","model_configuration","system_shared_memory","cuda_shared_memory","binary_tensor_data","parameters","statistics","trace","logging"]}
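
Since the /v2 endpoint answers, the same HTTP API can also report the model's input and output tensors, which you need before building a request. Here is a minimal sketch using the Python Triton HTTP client, assuming tritonclient is installed (pip install tritonclient[http]) and reusing the address and model name from the log above; get_model_metadata() corresponds to GET /v2/models/peopleNet on the wire:

import tritonclient.http as httpclient

# Connect to the HTTP endpoint from the server log (port 8000)
client = httpclient.InferenceServerClient(url="192.168.50.20:8000")

# Confirm the server and the model are ready to accept requests
print("server ready:", client.is_server_ready())
print("model ready: ", client.is_model_ready("peopleNet"))

# Print the input/output tensor names, datatypes, and shapes
print(client.get_model_metadata("peopleNet"))
print(client.get_model_config("peopleNet"))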

I have also read the output description in the peopleNet model card:

Output:

Output Type(s): Label(s), Bounding-Box(es), Confidence Scores
Output Format: Label: Text String(s); Bounding Box: (x-coordinate, y-coordinate, width, height), Confidence Scores: Floating Point
Other Properties Related to Output: Category Label(s): Bag, Face, Person, Bounding Box Coordinates, Confidence Scores
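
Given that output description, an HTTP inference round trip could look like the sketch below. This is only a sketch, not a verified recipe: the tensor names and the 1x3x544x960 input shape are assumptions based on typical peopleNet exports, so replace them with whatever get_model_metadata("peopleNet") or your config.pbtxt actually reports. Note also that the raw outputs are a coverage grid and a box grid; producing the final labels, boxes, and confidence scores still requires peopleNet's postprocessing (bbox decoding plus clustering/NMS).

import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="192.168.50.20:8000")

# ASSUMED tensor names, typical for peopleNet exports; verify them against
# get_model_metadata("peopleNet") or the model's config.pbtxt.
INPUT_NAME = "input_1"               # hypothetical
OUTPUT_BBOX = "output_bbox/BiasAdd"  # hypothetical
OUTPUT_COV = "output_cov/Sigmoid"    # hypothetical

# Placeholder batch: 1 x 3 x 544 x 960 float32. For a real request, load a
# frame, convert to RGB, resize to 960x544, scale to [0, 1], and transpose
# to CHW before adding the batch dimension.
image = np.zeros((1, 3, 544, 960), dtype=np.float32)

infer_input = httpclient.InferInput(INPUT_NAME, list(image.shape), "FP32")
infer_input.set_data_from_numpy(image)

outputs = [
    httpclient.InferRequestedOutput(OUTPUT_BBOX),
    httpclient.InferRequestedOutput(OUTPUT_COV),
]

result = client.infer(model_name="peopleNet", inputs=[infer_input], outputs=outputs)

# Raw detection grids; decode and cluster these to get the final boxes
print(OUTPUT_BBOX, result.as_numpy(OUTPUT_BBOX).shape)
print(OUTPUT_COV, result.as_numpy(OUTPUT_COV).shape)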

Hi @wotkddlek2,
Please reach out via the link below to get help with Triton-related issues.

Thanks