Multi-GPU, multi-model, multi-thread

Q: How do I handle multi-GPU, multi-thread, multi-model inference?
A: Each ICudaEngine object is bound to a specific GPU when it is instantiated, either by the builder or on deserialization. To select the GPU, use cudaSetDevice() before calling the builder or deserializing the engine. Each IExecutionContext is bound to the same GPU as the engine from which it was created. When calling execute() or enqueue(), ensure that the thread is associated with the correct device by calling cudaSetDevice() if necessary.
A: Call the enqueue() function of the execution contexts on different streams to allow them to run in parallel.
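The two answers above boil down to one thread per engine, with each thread selecting its GPU before it touches the engine. Here is a minimal sketch of that structure in plain Python. It does not call TensorRT or CUDA itself: `Job`, `run_jobs_in_parallel`, and the three hooks `set_device`, `load_engine`, and `infer` are hypothetical names made up for illustration. In a real program, `set_device` would wrap cudaSetDevice(), `load_engine` would deserialize the engine and create its execution context, and `infer` would call enqueue() on a stream owned by that thread.

```python
import threading
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class Job:
    """One model to run on one GPU (all fields are illustrative)."""
    name: str         # label used as the key in the results
    device_id: int    # GPU index the engine should be bound to
    engine_path: str  # serialized engine file (hypothetical path)
    batch: Any        # input data handed to the infer hook


def run_jobs_in_parallel(
    jobs: List[Job],
    set_device: Callable[[int], None],
    load_engine: Callable[[str], Any],
    infer: Callable[[Any, Any], Any],
) -> Dict[str, Any]:
    """Run every job in its own thread and return outputs keyed by job name."""
    results: Dict[str, Any] = {}
    errors: Dict[str, BaseException] = {}
    lock = threading.Lock()

    def worker(job: Job) -> None:
        try:
            # Select the GPU first: the engine is bound to whichever device
            # is current when it is deserialized in this thread.
            set_device(job.device_id)
            engine = load_engine(job.engine_path)
            # The hook is expected to enqueue on its own stream, so that
            # different threads do not serialize on a shared stream.
            output = infer(engine, job.batch)
            with lock:
                results[job.name] = output
        except Exception as exc:  # collect failures instead of losing them
            with lock:
                errors[job.name] = exc

    threads = [threading.Thread(target=worker, args=(job,)) for job in jobs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    if errors:
        raise RuntimeError(f"jobs failed: {sorted(errors)}")
    return results
```

The hooks are passed in rather than hard-coded so that the threading logic can be checked without a GPU. Several jobs may also share the same `device_id`, which corresponds to running several execution contexts on one GPU with separate streams.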

Hi @FreedomLiX ,
This seems to be a note shared about TensorRT. Do you need support from a DeepStream perspective?

Yeah!


Can you describe your question for DeepStream?
Examples of using multiple models:

Binding a model to a different GPU can be achieved with the gpu-id setting in the nvinfer parameters; see Gst-nvinfer — DeepStream 6.2 Release documentation.
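As a rough illustration of that setting, two nvinfer instances could each point to their own config file, with the GPU index set in the [property] group (the key is written gpu-id there). The file names and engine paths below are hypothetical placeholders, and a real config needs the other model-specific keys as well.

```
# config_infer_model_a.txt (hypothetical file name)
[property]
gpu-id=0
model-engine-file=model_a.engine

# config_infer_model_b.txt (hypothetical file name)
[property]
gpu-id=1
model-engine-file=model_b.engine
```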

You can share your requirements/pipeline for further discussion.

Without using DeepStream, how can I run model inference in parallel? Is there any sample code?

Sorry, but this forum focuses on DeepStream topics. You may rephrase your question and ask it in the TensorRT forum.

