CUDA shared memory registration fails when DeepStream sends inference requests to an external Triton server

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) : GPU
• DeepStream Version : 6.1
• JetPack Version (valid for Jetson only)
• TensorRT Version :
• NVIDIA GPU Driver Version (valid for GPU only) : 525.105.17
• Issue Type( questions, new requirements, bugs)
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)

The execution environment is as follows:

  1. Triton server version: 23.10
  2. DeepStream sends inference requests to a Triton server running in a separately started Docker container.
  3. In DeepStream's nvinferserver config (config.pbtxt), enable_cuda_buffer_sharing: true is set.
  4. When DeepStream makes a single inference request to one GPU, it executes normally.
  5. When multiple DeepStream pipelines make inference requests at the same time, the errors shown in item 6 occur, but over time the system stabilizes and the multiple pipelines run normally.
  6. ERROR: infer_grpc_client.cpp:223 Failed to register CUDA shared memory.
    ERROR: infer_grpc_client.cpp:311 Failed to set inference input: failed to register CUDA shared memory region 'inbuf_0x2be8300': failed to open CUDA IPC handle: invalid argument
    ERROR: infer_grpc_backend.cpp:140 gRPC backend run failed to create request for model: yolov8_pose
    ERROR: infer_trtis_backend.cpp:350 failed to specify dims when running inference on model:yolov8_pose, nvinfer error:NVDSINFER_TRITON_ERROR
  7. I want to prevent the errors in item 6 from occurring when multiple inference requests are made.
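For reference, the setting mentioned in step 3 lives in the gRPC section of the nvinferserver configuration (protobuf text format). This is a minimal sketch of the relevant fragment; the endpoint URL is a placeholder for this setup:

```
infer_config {
  backend {
    triton {
      model_name: "yolov8_pose"
      grpc {
        url: "localhost:8001"             # external Triton gRPC endpoint (placeholder)
        enable_cuda_buffer_sharing: true  # pass CUDA buffers via IPC handles instead of copying over gRPC
      }
    }
  }
}
```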

• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

  1. I want to prevent the errors above when making multiple inference requests.
  2. The documentation states that enable_cuda_buffer_sharing: true is valid when the Triton server runs inside the DeepStream Docker container, but I confirmed that the setup eventually operates normally even against an external Triton server. Please tell me how to prevent the above error from occurring.

The "enable_cuda_buffer_sharing" feature should be enabled only when the Triton server is on the same machine. Are the client and Triton server on the same machine? If yes, could you use deepstream-test1, which supports nvinferserver, to reproduce this issue? Thanks!

Are you saying that the Triton server needs to run inside the DeepStream Docker container when using DeepStream in Docker?
If so, as I wrote above, the Triton server was run as a separate Docker container.
As a result, a total of two Docker containers, DeepStream and Triton server, are in use.
I know that the enable_cuda_buffer_sharing feature works without problems when the Triton server runs inside the same DeepStream container.
However, in my case there is no Triton server inside the DeepStream container; with the external Triton server container it took time to stabilize, but it did work in the end.
I am still getting "ERROR: infer_grpc_client.cpp:223 Failed to register CUDA shared memory." and would like to resolve this error.

  1. Thanks for sharing! It seems the DeepStream client and Triton server are on the same machine but in different Docker containers. I will try to reproduce this "Failed to register CUDA shared memory" error.
  2. What do you mean by "makes one inference request to one GPU" and "when multiple DeepStream inference requests are made"? How can these steps be reproduced?

Please start the Docker containers with "--ipc host"; please refer to this topic. On one machine, I started two deepstream:6.4-triton-multiarch containers as the client and server respectively. deepstream-app ran well with enable_cuda_buffer_sharing=true.
If you still encounter "Failed to register CUDA shared memory", please use deepstream-app to reproduce this issue and share the detailed reproduction steps.
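For completeness, the suggestion above amounts to starting both containers in the host IPC namespace, so that CUDA IPC handles opened in one container can be resolved in the other. A hedged sketch of the commands, assuming the image tags from this thread (model repository path and config file are placeholders):

```shell
# Server container: Triton, sharing the host IPC namespace and GPUs
docker run --gpus all --ipc=host --rm -it \
    nvcr.io/nvidia/tritonserver:23.10-py3 \
    tritonserver --model-repository=/models

# Client container: DeepStream, also on the host IPC namespace
docker run --gpus all --ipc=host --rm -it \
    nvcr.io/nvidia/deepstream:6.1-triton \
    deepstream-app -c <config_file>
```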

I found a solution.
I had been running multiple DeepStream pipelines via Python's multiprocessing module.
This approach seems to have a problem: multiple processes under the same parent PID try to use the CUDA shared memory at the same time.
After changing that part to subprocess, each pipeline runs as a fully separate process, and the CUDA shared memory error appears to be resolved.
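The fix described above can be sketched as follows. This is a hypothetical illustration (the pipeline command and stream URIs are placeholders, not from the original post): each pipeline is launched with subprocess.Popen as an independent child process with a fresh interpreter, rather than as a multiprocessing worker forked from one parent, so each pipeline gets its own CUDA context and its own CUDA IPC handles.

```python
import subprocess
import sys

def launch_pipelines(stream_uris):
    """Start one fully independent child process per stream."""
    procs = []
    for uri in stream_uris:
        # In a real deployment the command would be something like
        # [sys.executable, "run_pipeline.py", uri] (hypothetical script name);
        # a stub child that prints its own PID keeps this sketch self-contained.
        p = subprocess.Popen(
            [sys.executable, "-c", "import os; print(os.getpid())"],
            stdout=subprocess.PIPE, text=True)
        procs.append(p)
    return procs

procs = launch_pipelines(["rtsp://cam1", "rtsp://cam2"])
# communicate() waits for each child and collects its stdout
child_pids = [int(p.communicate()[0]) for p in procs]
print(child_pids)
```

Because each child is a separate OS process rather than a fork of the parent, no two pipelines share the parent's CUDA state.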
Thank you for your interest.

