Enqueue Function not Asynchronous

Hi,

I have been working with TensorRT for a while now. In my implementation, I use the enqueue function. I create the stream and make sure it is not synchronized until the operation is complete. However, I find that the enqueue function does not behave asynchronously.

// Host-side timing around the call: this measures how long enqueue() itself blocks.
auto t_start = std::chrono::high_resolution_clock::now();
mTrtContext_uff->enqueue(batchSize, &mTrtCudaBuffer_uff[inputIndex], mTrtCudaStream_uff, nullptr);
auto t_end = std::chrono::high_resolution_clock::now();
inference_time = std::chrono::duration<float, std::milli>(t_end - t_start).count();

I logged the inference_time and found the average to be 63.94 ms. I would have expected it to be close to zero, since enqueue should be asynchronous.
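The expectation above can be illustrated with a host-only sketch. It does not use TensorRT or CUDA: std::async stands in for enqueue, waiting on the future stands in for a stream synchronization, and the names Timings and timeAsyncLaunch are hypothetical. Under those assumptions, the time spent in the launching call should be far smaller than the time until the work has finished.

```cpp
#include <chrono>
#include <future>
#include <thread>

// Host-only stand-in for GPU work: this is NOT TensorRT or CUDA code.
struct Timings {
    double launch_ms;  // time spent inside the launching call
    double total_ms;   // time until the work has actually finished
};

// Hypothetical helper: starts `work_ms` of simulated work on another thread
// and reports how long the launch took versus how long completion took.
Timings timeAsyncLaunch(int work_ms) {
    using Clock = std::chrono::steady_clock;
    using Ms = std::chrono::duration<double, std::milli>;

    const auto t_start = Clock::now();
    // Plays the role of enqueue(): returns once the work has been handed off.
    std::future<void> done = std::async(std::launch::async, [work_ms] {
        std::this_thread::sleep_for(std::chrono::milliseconds(work_ms));
    });
    const auto t_launched = Clock::now();

    // Plays the role of a stream synchronization: block until the work is done.
    done.wait();
    const auto t_done = Clock::now();

    return {Ms(t_launched - t_start).count(), Ms(t_done - t_start).count()};
}
```

Calling timeAsyncLaunch(200) should report a launch_ms well below 200 and a total_ms of at least roughly 200. If a call that is supposed to be asynchronous reports a launch time close to the total, something inside it is waiting for the work to finish.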

When I change the enqueue function to execute, like so:

// execute() is the synchronous variant: it returns only after inference has finished.
auto t_start = std::chrono::high_resolution_clock::now();
mTrtContext_uff->execute(batchSize, &mTrtCudaBuffer_uff[inputIndex]);
auto t_end = std::chrono::high_resolution_clock::now();
inference_time = std::chrono::duration<float, std::milli>(t_end - t_start).count();

The average inference time logged is 56.29 ms. It runs even faster.

Any reason why?
The model used for this test was converted from a .pb file into a .uff file and used to generate a .engine for inference. To the best of my knowledge, there are no synchronization functions called in the model.

Environment

TensorRT Version: 7.0.0.11
GPU Type: GTX1060 (but I encounter the same problem with T4s)
Nvidia Driver Version: 26.21.14.4292
CUDA Version: 10.2
CUDNN Version: 7.6.5
Operating System + Version: Windows 10

Hi,

TensorRT execution is typically asynchronous.
If a plugin contains a sync operation, the whole network will behave synchronously.
I would also recommend trying the latest TRT release.

Thanks

Hi SunilJB,

Thanks for the reply. I made a mistake with the TensorRT version. I’m actually using TensorRT-7.0.0.11. I’ve since updated my post. Sorry for the confusion.

I’ve checked the plugins and found no sync operations. I’m not even using any external plugins for this as none were required. If I had introduced a sync operator, where would this have been done? Model conversion (.pb -> .uff), creating engine (.uff -> .engine), serialization of model, actual enqueue operation?

Thanks.

Colin

Hi,

Can you share the model & script file to reproduce the issue so we can help better?

Thanks

Hi SunilJB,

Sorry for the delay in response. You were right. We did introduce some sync operations. After removing them, the enqueue function became asynchronous. Thanks.
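For readers hitting the same symptom, the effect of a stray sync can be sketched without a GPU. The code below is a host-only model, not TensorRT or CUDA: FakeStream and timeEnqueue are hypothetical names, a single worker thread stands in for a CUDA stream, and each "layer" is a 50 ms sleep. It assumes the sync sits between two layers, as a sync call inside a plugin's enqueue would.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical host-only model of a stream: tasks run in order on one worker.
class FakeStream {
public:
    FakeStream() : worker_([this] { run(); }) {}
    ~FakeStream() {
        { std::lock_guard<std::mutex> lock(m_); stop_ = true; }
        cv_.notify_all();
        worker_.join();
    }
    // Returns immediately; the task runs later on the worker thread.
    void enqueue(std::function<void()> task) {
        { std::lock_guard<std::mutex> lock(m_); tasks_.push(std::move(task)); ++pending_; }
        cv_.notify_all();
    }
    // Blocks the caller until every queued task has finished.
    void synchronize() {
        std::unique_lock<std::mutex> lock(m_);
        idle_.wait(lock, [this] { return pending_ == 0; });
    }
private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return stop_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stop requested and queue drained
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();
            { std::lock_guard<std::mutex> lock(m_); --pending_; }
            idle_.notify_all();
        }
    }
    std::mutex m_;
    std::condition_variable cv_, idle_;
    std::queue<std::function<void()>> tasks_;
    int pending_ = 0;
    bool stop_ = false;
    std::thread worker_;  // declared last so the other members exist before it starts
};

// Queues three 50 ms "layers". If `layer_syncs` is true, a sync is issued after
// the second layer, as a plugin containing a sync call would do.
// Returns the milliseconds spent in the enqueue phase only.
double timeEnqueue(bool layer_syncs) {
    using Clock = std::chrono::steady_clock;
    FakeStream stream;
    auto layer = [] { std::this_thread::sleep_for(std::chrono::milliseconds(50)); };

    const auto t_start = Clock::now();
    stream.enqueue(layer);
    stream.enqueue(layer);
    if (layer_syncs) stream.synchronize();  // the hidden sync
    stream.enqueue(layer);
    const auto t_end = Clock::now();

    stream.synchronize();  // drain the queue before the stream is destroyed
    return std::chrono::duration<double, std::milli>(t_end - t_start).count();
}
```

In this model, timeEnqueue(false) should return almost immediately, while timeEnqueue(true) should take about 100 ms, because the caller waits for the two layers queued before the sync point. One sync anywhere in the enqueue path is enough to make the measured enqueue time look like inference time.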