Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU): GPU
• DeepStream Version: 6.2 (Triton)
• TensorRT Version: 8.5.2
• NVIDIA GPU Driver Version (valid for GPU only): 525.85.05, CUDA Version: 12.0
• Issue Type (questions, new requirements, bugs): Question
• How to reproduce the issue? (This is for bugs. Include which sample app is being used, the configuration file contents, the command line used, and other details for reproducing.)
I am using a DeepStream Triton deployment that includes:
I am looking for a way to share tensors using the zero-copy mechanism that Triton supports, in order to share tensors between the postprocessor and the preprocessor at runtime, "upstream". (To be clear: I specifically need to share from the postprocessor to the preprocessor, so it cannot just be an extra input to the preprocessor.) Is there any API or interface I can call for that?
I know for sure that this happens "behind the scenes" in Triton, because it supports zero copy.
You might use the DeepStream nvinferserver plugin to do inference; nvinferserver supports preprocessing, inference, and postprocessing, and you only need to modify the configuration file. Regarding "share from the postprocessor to the preprocessor": you might use nvinferserver's IInferCustomProcessor interface, which supports "User can process last frame's output tensor from inferenceDone() and feed into next frame's inference input tensor in extraInputProcess()". Please refer to the DeepStream sample: opt\nvidia\deepstream\deepstream\sources\TritonOnnxYolo\nvdsinferserver_custom_impl_yolo\nvdsinferserver_custom_process_yolo.cpp
Hi! It's probably my fault for not mentioning this: it is all running on nvinferserver with Triton, and I need to send some tensors upstream for my use case. My model uses a history vector on dynamic batches to identify certain events on cameras.
There has been no update from you for a while, so we are assuming this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.
About "I need to share from the postprocessor to the preprocessor": do you mean you will feed the last frame's output tensor into the next frame's inference input tensor? If yes, please refer to nvinferserver's IInferCustomProcessor interface mentioned in my previous comments. Here is a better sample: opt\nvidia\deepstream\deepstream\sources\objectDetector_FasterRCNN\nvdsinfer_custom_impl_fasterRCNN\nvdsinferserver_custom_process.cpp, which shows how to get the output in inferenceDone() and feed it back into the input. Here is the doc: doc
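For completeness, the custom processor is hooked into nvinferserver through the plugin's configuration file. A hedged sketch is below — the library path and factory-function name are placeholders, and the field spelling (including NVIDIA's `custom_process_funcion`) should be verified against the nvdsinferserver proto shipped with your DeepStream version:

```
infer_config {
  custom_lib {
    path: "/path/to/libnvdsinferserver_custom_process.so"
  }
  extra {
    custom_process_funcion: "CreateInferServerCustomProcess"
  }
}
```

nvinferserver loads the shared library and calls the named factory function to obtain your IInferCustomProcessor instance.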
Do you want to use nvinferserver to do the preprocessing (C++ code; you need to modify the configuration file), or use Triton to do Python preprocessing? As you know, nvinferserver leverages Triton to do inference, and Python preprocessing and postprocessing can be encapsulated into a model. Here is the doc: doc, and here is a sample: preprocess_py
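One common way to encapsulate Python pre/postprocessing "into a model" is a Triton ensemble that chains a python-backend preprocessing model with the inference model, so the client (or nvinferserver) sees a single model. A hedged sketch of such a config.pbtxt — all model names, tensor names, and shapes here are hypothetical:

```
name: "ensemble_detector"
platform: "ensemble"
input [
  { name: "RAW_IMAGE", data_type: TYPE_UINT8, dims: [ -1, -1, 3 ] }
]
output [
  { name: "DETECTIONS", data_type: TYPE_FP32, dims: [ -1, 7 ] }
]
ensemble_scheduling {
  step [
    {
      # python-backend model holding the preprocessing code
      model_name: "preprocess_py"
      model_version: -1
      input_map { key: "INPUT0" value: "RAW_IMAGE" }
      output_map { key: "OUTPUT0" value: "preprocessed" }
    },
    {
      # the actual inference model
      model_name: "detector"
      model_version: -1
      input_map { key: "input_tensor" value: "preprocessed" }
      output_map { key: "output_tensor" value: "DETECTIONS" }
    }
  ]
}
```

Triton passes the intermediate "preprocessed" tensor between the two steps internally, which is where its zero-copy sharing applies.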
Could you elaborate on "share tensors with zero copy mechanism as supported by Triton"? nvinferserver has two modes: one is native, the other is gRPC mode; here is the doc: doc. If using gRPC mode, enable_cuda_buffer_sharing can share CUDA buffers; please refer to the doc.
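As a sketch, enabling CUDA buffer sharing in gRPC mode is a configuration-file change in the nvinferserver Triton backend block. The URL is a placeholder, and the exact field nesting should be checked against your DeepStream version's nvdsinferserver proto:

```
infer_config {
  backend {
    triton {
      grpc {
        url: "localhost:8001"
        enable_cuda_buffer_sharing: true
      }
    }
  }
}
```

This only applies when the Triton server runs on the same machine as the pipeline, since the CUDA buffers are shared rather than serialized over gRPC.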