How to use input_consumed in execute_async

execute_async(self: tensorrt.tensorrt.IExecutionContext, batch_size: int = 1, 
    bindings: List[int], stream_handle: int, input_consumed: capsule = None) → bool

I noticed in the TensorRT 7 Python API, the execute_async function has an optional argument input_consumed. The doc doesn’t clarify how to use this argument. I want to utilize it so that I can safely copy new input to the page locked host buffer for my next inference as soon as possible. What is this capsule type exactly?

Hi @alexis.yang
We do not have a sample around this.
input_consumed takes a CUDA event. TensorRT records that event once the input bindings have been consumed, so you can synchronize on it to know when it is safe to refill the input buffers.
Please check the below link for reference.

Thanks for your response. Are you referring to the CUDA event here:
Can I pass a pycuda.driver.Event as the input_consumed argument? And can I assume that calling Event.synchronize() afterwards will block until the input buffer has been consumed?
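Concretely, the pattern I have in mind looks like the sketch below. It assumes pycuda and TensorRT 7; the helper name is mine, and passing the event's raw handle as input_consumed is an assumption on my part, since the docs only describe the parameter as a capsule.

```python
import numpy as np

def infer_overlapped(context, bindings, stream, batches):
    """Sketch: refill the page-locked input buffer as soon as TensorRT
    signals (via input_consumed) that the previous input was consumed.

    context  - tensorrt.IExecutionContext
    bindings - list of device pointers (ints), input binding first
    batches  - list of numpy arrays matching the input binding shape
    """
    import pycuda.driver as cuda  # deferred so this file imports without a GPU

    # Page-locked (pinned) host staging buffer for the input binding.
    host_input = cuda.pagelocked_empty_like(batches[0])
    input_consumed = cuda.Event()

    for batch in batches:
        np.copyto(host_input, batch)
        cuda.memcpy_htod_async(bindings[0], host_input, stream)
        # Assumption: pycuda's Event exposes a .handle usable here.
        context.execute_async(bindings=bindings,
                              stream_handle=stream.handle,
                              input_consumed=input_consumed.handle)
        # Blocks only until the input was consumed, not until inference
        # finishes, so host_input can be safely overwritten for the next
        # batch while the previous inference is still in flight.
        input_consumed.synchronize()
```

The point of synchronizing on the event rather than the whole stream is that overwriting the pinned host buffer doesn't have to wait for the inference itself to complete.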

Hi @alexis.yang,

Could you please let us know if you are still facing this issue?


Sorry, I ended up not using that parameter. But I think the link I posted should point people in the right direction if they face a similar issue.