Failed to deploy the inference server: make an inference request to the peopleNet model via HTTP

Description

The Triton Inference Server is running and reports the peopleNet model as READY, but I don't know how to make an inference request to it over HTTP with the Triton client.

Environment

TensorRT Version:
GPU Type: NVIDIA GeForce GTX 1650
Nvidia Driver Version: 535.183.06
CUDA Version: 12.2
CUDNN Version:
Operating System + Version: Ubuntu 22.04
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Steps To Reproduce

I deployed a Triton server and built and loaded the peopleNet model.
Now I want to run inference on peopleNet over HTTP using the Triton client, but I can't get it to work.
What should I do? I need an example.

Here is the Triton server startup log:

I0828 08:14:49.458910 2675 server.cc:631]
+----------+-----------------------------------------------------------+---------------------------------------------------+
| Backend  | Path                                                      | Config                                            |
+----------+-----------------------------------------------------------+---------------------------------------------------+
| pytorch  | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so  | {}                                                |
| tensorrt | /opt/tritonserver/backends/tensorrt/libtriton_tensorrt.so | {"cmdline":{"auto-complete-config":"true",        |
|          |                                                           | "backend-directory":"/opt/tritonserver/backends", |
|          |                                                           | "min-compute-capability":"6.000000",              |
|          |                                                           | "default-max-batch-size":"4"}}                    |
+----------+-----------------------------------------------------------+---------------------------------------------------+

I0828 08:14:49.458935 2675 server.cc:674]
+-----------+---------+--------+
| Model     | Version | Status |
+-----------+---------+--------+
| peopleNet | 1       | READY  |
+-----------+---------+--------+

I0828 08:14:49.487243 2675 metrics.cc:810] Collecting metrics for GPU 0: NVIDIA GeForce GTX 1650
I0828 08:14:49.487381 2675 metrics.cc:703] Collecting CPU metrics
I0828 08:14:49.487535 2675 tritonserver.cc:2435]
+----------------------------------+----------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                        |
+----------------------------------+----------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                       |
| server_version                   | 2.37.0                                                                                       |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy |
|                                  | model_configuration system_shared_memory cuda_shared_memory binary_tensor_data parameters    |
|                                  | statistics trace logging                                                                     |
| model_repository_path[0]         | /opt/nvidia/deepstream/deepstream-6.4/samples/configs/tao_pretrained_models/triton           |
| model_control_mode               | MODE_NONE                                                                                    |
| strict_model_config              | 0                                                                                            |
| rate_limit                       | OFF                                                                                          |
| pinned_memory_pool_byte_size     | 268435456                                                                                    |
| cuda_memory_pool_byte_size{0}    | 67108864                                                                                     |
| min_supported_compute_capability | 6.0                                                                                          |
| strict_readiness                 | 1                                                                                            |
| exit_timeout                     | 30                                                                                           |
| cache_enabled                    | 0                                                                                            |
+----------------------------------+----------------------------------------------------------------------------------------------+

I0828 08:14:49.488479 2675 grpc_server.cc:2451] Started GRPCInferenceService at 0.0.0.0:8001
I0828 08:14:49.488627 2675 http_server.cc:3558] Started HTTPService at 0.0.0.0:8000
I0828 08:14:49.529903 2675 http_server.cc:187] Started Metrics Service at 0.0.0.0:8002

===========================================================================

arstest@arstest:/media/arstest/F567-A30F/triton/opencv$ curl -v http://192.168.50.20:8000/v2
*   Trying 192.168.50.20:8000...
* Connected to 192.168.50.20 (192.168.50.20) port 8000 (#0)
> GET /v2 HTTP/1.1
> Host: 192.168.50.20:8000
> User-Agent: curl/7.81.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Type: application/json
< Content-Length: 285
<
* Connection #0 to host 192.168.50.20 left intact
{"name":"triton","version":"2.37.0","extensions":["classification","sequence","model_repository","model_repository(unload_dependents)","schedule_policy","model_configuration","system_shared_memory","cuda_shared_memory","binary_tensor_data","parameters","statistics","trace","logging"]}
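
Since the /v2 endpoint answers, the same HTTP API can also report the model's input and output tensors, which you need before building a request. Here is a minimal sketch using the Python Triton HTTP client, assuming tritonclient is installed (pip install tritonclient[http]) and reusing the address and model name from the log above; get_model_metadata() corresponds to GET /v2/models/peopleNet on the wire:

import tritonclient.http as httpclient

# Connect to the HTTP endpoint from the server log (port 8000)
client = httpclient.InferenceServerClient(url="192.168.50.20:8000")

# Confirm the server and the model are ready to accept requests
print("server ready:", client.is_server_ready())
print("model ready: ", client.is_model_ready("peopleNet"))

# Print the input/output tensor names, datatypes, and shapes
print(client.get_model_metadata("peopleNet"))
print(client.get_model_config("peopleNet"))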

I have also read the output description in the peopleNet model card:

Output:

Output Type(s): Label(s), Bounding-Box(es), Confidence Scores
Output Format: Label: Text String(s); Bounding Box: (x-coordinate, y-coordinate, width, height), Confidence Scores: Floating Point
Other Properties Related to Output: Category Label(s): Bag, Face, Person, Bounding Box Coordinates, Confidence Scores
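
Given that output description, an HTTP inference round trip could look like the sketch below. This is only a sketch, not a verified recipe: the tensor names and the 1x3x544x960 input shape are assumptions based on typical peopleNet exports, so replace them with whatever get_model_metadata("peopleNet") or your config.pbtxt actually reports. Note also that the raw outputs are a coverage grid and a box grid; producing the final labels, boxes, and confidence scores still requires peopleNet's postprocessing (bbox decoding plus clustering/NMS).

import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="192.168.50.20:8000")

# ASSUMED tensor names, typical for peopleNet exports; verify them against
# get_model_metadata("peopleNet") or the model's config.pbtxt.
INPUT_NAME = "input_1"               # hypothetical
OUTPUT_BBOX = "output_bbox/BiasAdd"  # hypothetical
OUTPUT_COV = "output_cov/Sigmoid"    # hypothetical

# Placeholder batch: 1 x 3 x 544 x 960 float32. For a real request, load a
# frame, convert to RGB, resize to 960x544, scale to [0, 1], and transpose
# to CHW before adding the batch dimension.
image = np.zeros((1, 3, 544, 960), dtype=np.float32)

infer_input = httpclient.InferInput(INPUT_NAME, list(image.shape), "FP32")
infer_input.set_data_from_numpy(image)

outputs = [
    httpclient.InferRequestedOutput(OUTPUT_BBOX),
    httpclient.InferRequestedOutput(OUTPUT_COV),
]

result = client.infer(model_name="peopleNet", inputs=[infer_input], outputs=outputs)

# Raw detection grids; decode and cluster these to get the final boxes
print(OUTPUT_BBOX, result.as_numpy(OUTPUT_BBOX).shape)
print(OUTPUT_COV, result.as_numpy(OUTPUT_COV).shape)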

Hi @wotkddlek2,
Please reach out via the link below to get help with Triton-related issues.

Thanks