How to add a Triton server to DeepStream on a different device?

Please provide complete information as applicable to your setup.

• Hardware Platform: Jetson Xavier
• DeepStream Version: 5.0.1
• JetPack Version: 4.4
• TensorRT Version: 7.1
• Issue Type: new requirement

Hi,

I have the nvinferserver plugin working in DeepStream with deepstream-app, e.g. deepstream-app -c source1_primary_detector_nano.txt, and it runs successfully.

However, I want to deploy the Triton server on a dGPU and DeepStream on Jetson, so that the whole pipeline gets its inference results from the Triton server, including preprocessing and postprocessing.

Another possible solution is to start different pipelines that share one nvinferserver on Jetson or the dGPU; I observed that each pipeline I start spawns its own server, which costs a lot of resources.

Can you tell me how to achieve this, or whether it is possible at all? Thanks!

DeepStream Triton only supports running on the local machine, that is, both the Triton client and server are in one DS instance.

But DS supports nvmsgbroker to communicate with a server; you could take a look to see whether it works for you.

Thanks!

Hi, mchi,

nvmsgbroker is not my ideal choice, as I want the inference capability of the server while the DeepStream pipeline runs on Jetson.

So, if Triton only supports running on the local machine, I wonder whether multiple pipelines can share a single Triton server?

Thanks !

In DeepStream, it’s not supported.

OK, thanks !

Hi,
You can always integrate a Triton client in DeepStream.
It comes as Python/C++:
you can integrate the C++ client in the DeepStream dsexample plugin, or the Python client in a Python probe (DeepStream Python bindings).
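
For the Python-probe route, here is a rough sketch (assuming the pyds bindings are installed, the pipeline converts frames to RGBA before this probe, and infer_frame() is just a placeholder for whatever Triton client call you wire up):

```python
# Pad probe (DeepStream Python bindings) that pulls each decoded frame out of
# the batch and hands it to a Triton client helper. The probe should be
# attached after a nvvideoconvert producing RGBA so get_nvds_buf_surface()
# can map the frame as a numpy array.
import numpy as np
import pyds
from gi.repository import Gst

def triton_probe(pad, info, user_data):
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        return Gst.PadProbeReturn.OK

    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)

        # Map this frame into a numpy array (copy so the buffer can be released).
        frame = pyds.get_nvds_buf_surface(hash(gst_buffer), frame_meta.batch_id)
        frame = np.array(frame, copy=True, order='C')

        # Placeholder: send the frame to Triton and get results back.
        detections = infer_frame(frame)
        # ...attach results back as NvDsObjectMeta here if OSD/tracker needs them...

        try:
            l_frame = l_frame.next
        except StopIteration:
            break
    return Gst.PadProbeReturn.OK
```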

Thanks,
Eyal

Hello Eyal,

I am looking for a similar kind of solution. I have looked at the Triton client examples with gRPC, but those only deal with single images. How can I do it for videos? Can you please share any reference example we can follow to achieve the same?

Do I need to treat the video as a series of images and call the Triton inference server for every frame?
What do you recommend if I need to run the DeepStream pipeline on the same host as the Triton server?

Hi,
Yes, decode efficiently via DeepStream and send each frame to inference: use the CUDA shared-memory Triton API for best performance on the same machine, or gRPC for a remote server.
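
Something like the following could serve as that per-frame client (and as the infer_frame() helper from the probe sketch earlier in this thread). The gRPC path is shown since it also works for a remote server; model name, tensor names, dtype and shape are placeholders, so substitute the values from your model's config.pbtxt. For the same-machine case, tritonclient also ships a CUDA shared-memory API (tritonclient.utils.cuda_shared_memory) that avoids copying frames over the network:

```python
# Per-frame gRPC client sketch; one request per decoded frame.
import numpy as np
import tritonclient.grpc as grpcclient

TRITON_URL = "triton-host:8001"   # assumed gRPC endpoint of the (remote) server
MODEL_NAME = "detector"           # placeholder model name

client = grpcclient.InferenceServerClient(url=TRITON_URL)

def infer_frame(frame_rgba: np.ndarray) -> np.ndarray:
    """Send one decoded frame to Triton and return the raw output tensor."""
    # Example preprocessing only: drop alpha, add a batch dim, cast to float32.
    # Adjust layout/normalization to whatever your model actually expects.
    img = frame_rgba[..., :3].astype(np.float32)[np.newaxis, ...]

    inputs = [grpcclient.InferInput("INPUT", list(img.shape), "FP32")]
    inputs[0].set_data_from_numpy(img)
    outputs = [grpcclient.InferRequestedOutput("OUTPUT")]

    result = client.infer(model_name=MODEL_NAME, inputs=inputs, outputs=outputs)
    return result.as_numpy("OUTPUT")
```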

Thanks


Thanks Eyal.

Does Triton Server support multi-stage inference and a tracker in a single request to the Triton Inference Server, or do we need to make a separate request for each inference stage? Does the Triton Inference Server help with tracking objects if we have a pipeline like the one below?

Decode → Inference stage 1 → Tracking → Inference stage 2 → OSD → Display

Thanks,
Dilip Patel

I suppose you can use model pipelines via Ensembling or Business Logic Scripting (BLS).
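
A rough BLS sketch is below, assuming a Triton release recent enough to support BLS in the Python backend; the model names ("stage1", "stage2") and tensor names are placeholders and must match each model's config.pbtxt. Note that tracking is stateful across frames, so in practice nvtracker usually stays in the DeepStream pipeline while Triton handles the per-frame inference stages.

```python
# model.py for a Triton Python-backend "router" model that chains two
# inference stages inside a single client request (Business Logic Scripting).
import triton_python_backend_utils as pb_utils

class TritonPythonModel:
    def execute(self, requests):
        responses = []
        for request in requests:
            # Assumes this model's input and stage1's input are both named IMAGE.
            image = pb_utils.get_input_tensor_by_name(request, "IMAGE")

            # Stage 1: primary detector.
            stage1_resp = pb_utils.InferenceRequest(
                model_name="stage1",
                requested_output_names=["BOXES"],
                inputs=[image]).exec()
            boxes = pb_utils.get_output_tensor_by_name(stage1_resp, "BOXES")

            # Stage 2: secondary model fed with stage 1's output
            # (assumes stage2's input is also named BOXES).
            stage2_resp = pb_utils.InferenceRequest(
                model_name="stage2",
                requested_output_names=["LABELS"],
                inputs=[boxes]).exec()
            labels = pb_utils.get_output_tensor_by_name(stage2_resp, "LABELS")

            responses.append(
                pb_utils.InferenceResponse(output_tensors=[boxes, labels]))
        return responses
```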