Latency when running a TensorRT engine on two GPUs

Hi,

Enqueue is a CPU-side call; it only submits work to a GPU stream and does not wait for it. Execution for context 1 starts when the code calls its enqueue, and the same applies to context 2, so a single thread issues them one after the other. If you really want both contexts to start inference at the same time, launching a dedicated host thread per context is the usual approach.
Please refer to the links below in case they help:

Alternatively, you can use DeepStream to run multiple models.

Thanks