How to use input_consumed in execute_async

execute_async(self: tensorrt.tensorrt.IExecutionContext, batch_size: int = 1, 
    bindings: List[int], stream_handle: int, input_consumed: capsule = None) → bool

I noticed that in the TensorRT 7 Python API, the execute_async function has an optional argument input_consumed. The doc doesn't clarify how to use this argument. I want to use it so that I can safely copy new input into the page-locked host buffer for my next inference as soon as possible. What exactly is this capsule type?

Hi @alexis.yang
We do not have a sample around this.
input_consumed takes a CUDA event (a cudaEvent_t); the event is used to synchronize on the input having been consumed.
Please check the below link for reference.
https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/infer/Core/ExecutionContext.html#tensorrt.IExecutionContext.execute_async
Thanks!

@AakankshaS
Thanks for your response. Are you referring to the CUDA Event here:
https://documen.tician.de/pycuda/driver.html?highlight=pagelocked_empty#pycuda.driver.Event
Can I pass in a pycuda.driver.Event as the input_consumed argument? Can I assume calling Event.synchronize() afterwards will block until the input buffer has been consumed?
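For reference, the usage I have in mind looks roughly like the sketch below. This is untested and makes two assumptions not confirmed by the docs: that pycuda exposes the raw cudaEvent_t pointer as Event.handle, and that TensorRT accepts that integer pointer value for the input_consumed capsule argument. It also assumes engine, context, the device buffers d_input/d_output, and the page-locked host buffer h_input have already been set up the usual way (as in the standard TensorRT Python samples):

```python
# Hedged sketch of using input_consumed to know when the input
# buffer can be safely refilled. Requires a GPU, a built engine,
# and pre-allocated buffers -- none of which are shown here.
import pycuda.driver as cuda
import pycuda.autoinit  # creates a CUDA context

stream = cuda.Stream()
input_consumed = cuda.Event()

# Copy the current batch to the device asynchronously.
cuda.memcpy_htod_async(d_input, h_input, stream)

# Launch inference; input_consumed should be recorded by TensorRT
# once the input bindings have been fully read.
context.execute_async(
    batch_size=1,
    bindings=[int(d_input), int(d_output)],
    stream_handle=stream.handle,
    input_consumed=input_consumed.handle,  # assumption: raw cudaEvent_t
)

# Block the host only until the input has been consumed -- not until
# the whole inference finishes -- then refill h_input for the next batch.
input_consumed.synchronize()
```

If this works as hoped, the point of the pattern is that Event.synchronize() returns earlier than stream.synchronize(), so the next batch's host-to-device copy can be prepared while the current inference is still running.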

Hi @alexis.yang,

Could you please let us know if you are still facing this issue?

Thanks

Sorry, I ended up not using that parameter. But I think the link I posted should point people in the right direction if they face a similar issue.