Based on the image:
After the inference call (ExecutionContext::enqueue), cudaEventRecord and cudaEventSynchronize are called.
What exactly does the total inference execution time cover? Does it include the time spent in cudaEventRecord and cudaEventSynchronize, or only the time of ExecutionContext::enqueue?