Use multiple CUDA streams with multiple TensorRT models

Hi, all.
Is it faster to run multiple TensorRT models on multiple CUDA streams, or to combine the multiple TensorRT models into a single model?

Hi,

It depends on the use case.
If a single model doesn’t occupy all of the GPU’s resources at once, running the models on separate CUDA streams lets their kernels execute concurrently, so parallel inference can improve overall throughput.
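As a rough illustration of the multi-stream approach, here is a minimal sketch using TensorRT's Python API with PyCUDA. The function name `run_models_concurrently`, and the assumption that each engine has exactly one input and one output binding, are mine, not from this thread; treat it as a starting point rather than a verified implementation (binding-index APIs such as `execute_async_v2` come from the TensorRT 8.x API and are deprecated in newer releases).

```python
import numpy as np


def run_models_concurrently(engine_paths, inputs):
    """Run one deserialized TensorRT engine per CUDA stream.

    Hypothetical sketch: assumes each engine has exactly one input
    binding (index 0) and one float32 output binding (index 1).
    """
    # GPU-only dependencies, imported lazily so this module still loads
    # on machines without TensorRT or a CUDA device.
    import tensorrt as trt
    import pycuda.autoinit  # noqa: F401  (creates the CUDA context)
    import pycuda.driver as cuda

    logger = trt.Logger(trt.Logger.WARNING)
    runtime = trt.Runtime(logger)

    jobs = []
    for path, inp in zip(engine_paths, inputs):
        with open(path, "rb") as f:
            engine = runtime.deserialize_cuda_engine(f.read())
        context = engine.create_execution_context()
        stream = cuda.Stream()  # one stream per model

        d_in = cuda.mem_alloc(inp.nbytes)
        out = np.empty(tuple(context.get_binding_shape(1)), dtype=np.float32)
        d_out = cuda.mem_alloc(out.nbytes)

        # Enqueue H2D copy, inference, and D2H copy on this model's stream.
        # Enqueues return immediately, so all models are launched back to
        # back and the GPU may overlap work from different streams.
        cuda.memcpy_htod_async(d_in, inp, stream)
        context.execute_async_v2([int(d_in), int(d_out)], stream.handle)
        cuda.memcpy_dtoh_async(out, d_out, stream)
        jobs.append((stream, out))

    # Only synchronize at the end, after everything has been enqueued.
    for stream, _ in jobs:
        stream.synchronize()
    return [out for _, out in jobs]
```

Whether this actually runs faster than one merged model depends on occupancy: if each model already saturates the SMs, the streams just serialize and the overhead buys you nothing.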

Thanks

Your answer is very helpful to me. Thanks.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.