GPU support with Triton iGPU image and Python Backend

Hi, I am trying to run the Triton image with my model in the Python backend.

I am using this image

http://nvcr.io/nvidia/tritonserver:24.06-py3-igpu

When I set the instance group to KIND_CPU it runs fine, although only on the CPU. When I set it to KIND_GPU, the Triton server crashes with the following error message:

I0904 21:12:33.899132 1 python_be.cc:1912] "TRITONBACKEND_ModelInstanceInitialize: my_model (GPU device 0)"
I0904 21:12:33.899318 1 python_be.cc:2050] "TRITONBACKEND_ModelInstanceFinalize: delete instance state"
E0904 21:12:33.899394 1 backend_model.cc:692] "ERROR: Failed to create instance: GPU instances not supported"
I0904 21:12:33.899425 1 python_be.cc:1891] "TRITONBACKEND_ModelFinalize: delete model state"
E0904 21:12:33.899491 1 model_lifecycle.cc:641] "failed to load 'my_model' version 1: Internal: GPU instances not supported"
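
For reference, the only thing I change between the working and the failing run is the instance_group block in config.pbtxt. Roughly like this (a sketch; the model name and max_batch_size are just illustrative):

name: "my_model"
backend: "python"
max_batch_size: 4

instance_group [
  {
    count: 1
    kind: KIND_GPU  # with KIND_CPU here, the model loads and runs fine
    gpus: [ 0 ]
  }
]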

Looking at the support page, it seems like GPU tensors are not supported on JetPack 5.0. Does that mean I cannot use the GPU when using pb_utils.Tensor?

import triton_python_backend_utils as pb_utils
pb_utils.Tensor  # <- not supported on GPU?
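
For context, here is roughly the execute() I would want to write if GPU tensors were supported (just a sketch; the tensor names and the doubling op are made up, though get_input_tensor_by_name, Tensor.to_dlpack/from_dlpack, and Tensor.is_cpu are real Python backend APIs as far as I can tell):

import triton_python_backend_utils as pb_utils
import torch
from torch.utils.dlpack import from_dlpack, to_dlpack

class TritonPythonModel:
    def execute(self, requests):
        responses = []
        for request in requests:
            in0 = pb_utils.get_input_tensor_by_name(request, "INPUT0")
            # With GPU tensor support, in0.is_cpu() should be False and the
            # data could be handed to PyTorch on-device via DLPack:
            x = from_dlpack(in0.to_dlpack())
            out = pb_utils.Tensor.from_dlpack("OUTPUT0", to_dlpack(x * 2))
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses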

I am running this on a Jetson Orin Nano 8GB with JetPack 6, so I am unsure whether the information on the support page is still valid.

Thank you!

Hi,

Thanks for your report.
We need to check the latest status with our internal team and will provide more info to you later.

Thanks.

Hi AastaLLL,

Thank you for your help! Have you received any updates from your internal team on this matter?

Thanks!

Hi,

Not yet.
We will let you know once we get feedback.

Thanks.

Hi AastaLLL,

I appreciate your help on this matter!

Just to give more context, I also tried to run the AddSubNet-in-PyTorch example provided in the python_backend repo, and I see the same error when setting instance_group [{ kind: KIND_GPU }]. Setting it to KIND_CPU works, but without GPU acceleration.
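
For reference, the model in that example is trivial; the network is roughly this (paraphrased from examples/pytorch in the python_backend repo, so details may differ slightly):

import torch

class AddSubNet(torch.nn.Module):
    # Returns the elementwise sum and difference of the two inputs.
    def forward(self, input0, input1):
        return input0 + input1, input0 - input1

The failure happens at instance creation, before any request ever reaches model.py.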

Here is part of the error log triton generates:

I0910 23:40:49.047496 557 pinned_memory_manager.cc:277] "Pinned memory pool is created at '0x203eae000' with size 268435456"
I0910 23:40:49.047770 557 cuda_memory_manager.cc:107] "CUDA memory pool is created on device 0 with size 67108864"
I0910 23:40:49.053250 557 model_lifecycle.cc:472] "loading: pytorch:1"
I0910 23:40:53.845067 557 python_be.cc:1912] "TRITONBACKEND_ModelInstanceInitialize: pytorch_0_0 (GPU device 0)"
E0910 23:40:53.846082 557 backend_model.cc:692] "ERROR: Failed to create instance: GPU instances not supported"
E0910 23:40:53.846415 557 model_lifecycle.cc:641] "failed to load 'pytorch' version 1: Internal: GPU instances not supported"
I0910 23:40:53.846516 557 model_lifecycle.cc:776] "failed to load 'pytorch'"
I0910 23:40:53.847145 557 server.cc:604]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0910 23:40:53.847323 557 server.cc:631]
+---------+-------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend | Path                                                  | Config                                                                                                                                                        |
+---------+-------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
| python  | /opt/tritonserver/backends/python/libtriton_python.so | {"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"5.300000","default-max-batch-size":"4"}} |
+---------+-------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0910 23:40:53.847541 557 server.cc:674]
+---------+---------+----------------------------------------------------+
| Model   | Version | Status                                             |
+---------+---------+----------------------------------------------------+
| pytorch | 1       | UNAVAILABLE: Internal: GPU instances not supported |
+---------+---------+----------------------------------------------------+

I am using the same Triton image

http://nvcr.io/nvidia/tritonserver:24.06-py3-igpu

Here is the output from nvidia-smi and nvcc -V

root@4fedf3f:/app/python_backend# nvidia-smi
Tue Sep 10 23:53:40 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 540.3.0                Driver Version: N/A          CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Orin (nvgpu)                  N/A  | N/A              N/A |                  N/A |
| N/A   N/A  N/A               N/A /  N/A | Not Supported        |     N/A          N/A |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
root@4fedf3f:/app/python_backend# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:08:11_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0

Thanks again for looking into this!

Hi,

Unfortunately, this is not supported.

Triton cannot enable GPU model instances for the Python backend because the Python backend communicates with the GPU through an IPC CUDA Driver API that is not supported on this platform.

We will update the documentation accordingly.

Thanks.

Thanks for the response, @AastaLLL! Could you clarify exactly which combination is not supported? Is it (Python backend) x (GPU) in general? Or only (Python backend) x (GPU) x (Jetson/ARM device)? Or just the tritonserver:24.06-py3-igpu image?

Hi,

The issue is specific to the Jetson environment.

Triton cannot enable GPU model instances for the Python backend because the Python backend communicates with the GPU through a legacy IPC CUDA Driver API, which does not work on Jetson.
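
In the meantime, a common workaround (not an official recommendation, just a sketch assuming PyTorch with CUDA works inside your container) is to keep the instance as KIND_CPU and move the data to the GPU inside model.py yourself:

import triton_python_backend_utils as pb_utils
import torch

class TritonPythonModel:
    def execute(self, requests):
        responses = []
        for request in requests:
            in0 = pb_utils.get_input_tensor_by_name(request, "INPUT0")
            # With KIND_CPU, Triton hands the model a host tensor; the model
            # can still compute on the GPU and copy the result back.
            x = torch.from_numpy(in0.as_numpy()).to("cuda")
            y = (x * 2).cpu().numpy()  # placeholder computation
            out = pb_utils.Tensor("OUTPUT0", y)
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses

Triton itself will not place the instance on the GPU, but the framework inside the model can still use CUDA.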

Thanks.

Hi,

Any progress here?
I ran into the same error with a similar setup.
My case is an ensemble model that runs on the GPU and passes tensors to a Python backend model.
So I opened an issue on the Triton server repo 2 days ago…
