Running a DeepStream + Triton pipeline with the Triton C API: Triton model inference gets stuck and the DeepStream frame rate drops to 0

• Hardware Platform (Jetson / GPU)
Tesla T4 GPU

• DeepStream Version
NGC: nvcr.io/nvidia/deepstream:6.1-triton

• JetPack Version (valid for Jetson only)

• TensorRT Version
8.2.5
• NVIDIA GPU Driver Version (valid for GPU only)
Driver Version: 515.48.07

• Issue Type( questions, new requirements, bugs)
bugs

• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)

• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

My DeepStream pipeline uses the Triton C API (not the Gst-nvinferserver plugin) to run the secondary model inference asynchronously. While the pipeline is running, the DeepStream frame rate sometimes drops to 0 and the Triton model inference gets stuck at the same time, and it never recovers. I don't know why; maybe there is competition for GPU resources between DeepStream and Triton? Could somebody help me with this problem? Thank you!
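
For context, below is a minimal, hypothetical sketch of the pattern described above: enqueueing one asynchronous inference on the in-process Triton server through the C API, in the style of server/simple.cc. This is not the actual pipeline code; the function and callback names are made up, it assumes the server, the request (with inputs already attached), and the response allocator were created beforehand as in simple.cc, and error checks on the returned TRITONSERVER_Error* are omitted for brevity.

```cpp
#include <future>

#include "triton/core/tritonserver.h"

// Release callback: Triton hands ownership of the request back here.
static void
RequestRelease(
    TRITONSERVER_InferenceRequest* request, const uint32_t flags, void* /*userp*/)
{
  if (flags & TRITONSERVER_REQUEST_RELEASE_ALL) {
    TRITONSERVER_InferenceRequestDelete(request);
  }
}

// Completion callback: Triton calls this from one of its own threads when the
// response is ready; pass the response back to the waiting pipeline thread.
static void
ResponseComplete(
    TRITONSERVER_InferenceResponse* response, const uint32_t /*flags*/, void* userp)
{
  auto* promise =
      static_cast<std::promise<TRITONSERVER_InferenceResponse*>*>(userp);
  promise->set_value(response);  // caller deletes the response later
}

// Enqueue one request and wait for its response.
TRITONSERVER_InferenceResponse*
InferOnce(
    TRITONSERVER_Server* server, TRITONSERVER_InferenceRequest* request,
    TRITONSERVER_ResponseAllocator* allocator)
{
  std::promise<TRITONSERVER_InferenceResponse*> promise;
  auto future = promise.get_future();

  TRITONSERVER_InferenceRequestSetReleaseCallback(request, RequestRelease, nullptr);
  TRITONSERVER_InferenceRequestSetResponseCallback(
      request, allocator, nullptr /* alloc userp */, ResponseComplete, &promise);

  // The call returns immediately; the pipeline thread then blocks on the future.
  TRITONSERVER_ServerInferAsync(server, request, nullptr /* trace */);
  return future.get();
}
```

A blocking wait like future.get() in a pipeline thread would explain the frame rate dropping to 0 whenever the response callback never fires, so the first step is to find out where exactly the hang occurs.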


Please refer to the Triton server sample: server/simple.cc at main · triton-inference-server/server · GitHub
Please refer to nvinferserver if needed: deepstream-infer-tensor-meta-test

Thanks for your reply! My Triton C API code already follows server/simple.cc at main · triton-inference-server/server · GitHub. I want to know what I should do next to debug this problem, but I have no idea right now. Could you make some suggestions?

  1. Does the model work OK with a third-party tool? Is it stuck at the beginning or only after a while?
  2. Please enable verbose logging via TRITONSERVER_ServerOptionsSetLogVerbose, then analyze the logs against the open-source Triton code.
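
For completeness, here is a minimal sketch of what enabling verbose logging through the C API can look like when the in-process server is created (the repository path is a placeholder and error checks on the returned TRITONSERVER_Error* are omitted):

```cpp
#include "triton/core/tritonserver.h"

// Build server options with verbose logging enabled, then create the
// in-process server. Level 1 or higher makes the backends, including the
// TensorRT backend, log each execution step.
TRITONSERVER_Server*
CreateVerboseServer()
{
  TRITONSERVER_ServerOptions* options = nullptr;
  TRITONSERVER_ServerOptionsNew(&options);
  TRITONSERVER_ServerOptionsSetModelRepositoryPath(
      options, "/path/to/model_repository");  // placeholder path
  TRITONSERVER_ServerOptionsSetLogVerbose(options, 1);

  TRITONSERVER_Server* server = nullptr;
  TRITONSERVER_ServerNew(&server, options);
  TRITONSERVER_ServerOptionsDelete(options);
  return server;
}
```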

Yes, both models work well with their TensorRT engines when run in two separate processes, and the pipeline only gets stuck occasionally: sometimes at the beginning, sometimes only after processing tens of thousands of frames.

I enabled verbose logging and found that tritonserver mostly gets stuck at two points in the TensorRT backend (cudaEventSynchronize() and cudaStreamSynchronize()). I wonder if a deadlock occurs between DeepStream (its TensorRT engine) and tritonserver (its TensorRT engine).

Do the native Triton samples show this issue on your machine? As you know, Triton is open source, so you can use gdb to analyze the stack.

Thank you! I’ll try that next.

Sorry for the late reply. Is this still an issue that needs support? Thanks

There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.
Thanks

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.