execute_async(self: tensorrt.tensorrt.IExecutionContext, batch_size: int = 1, bindings: List[int], stream_handle: int, input_consumed: capsule = None) → bool
I noticed that in the TensorRT 7 Python API, the execute_async function takes an optional argument, input_consumed. The documentation doesn't explain how to use it. I'd like to use it so that I can safely copy new input into the page-locked host buffer for my next inference as early as possible. What exactly is this capsule type?
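
For context on the type itself: at the Python level a "capsule" is a PyCapsule, an opaque CPython object that wraps a raw C pointer. The sketch below only shows how such an object can be built and inspected with ctypes; it does not call TensorRT. It assumes, based on the C++ counterpart `IExecutionContext::enqueue`, whose last parameter is a `cudaEvent_t*`, that the binding wants a capsule wrapping a pointer to a CUDA event. That assumption is not confirmed by the Python documentation, and `make_capsule` and `event_slot` are hypothetical names used for illustration.

```python
import ctypes

# CPython C API: PyCapsule_New(void *pointer, const char *name, destructor)
PyCapsule_New = ctypes.pythonapi.PyCapsule_New
PyCapsule_New.restype = ctypes.py_object
PyCapsule_New.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]

# CPython C API: PyCapsule_GetPointer(PyObject *capsule, const char *name)
PyCapsule_GetPointer = ctypes.pythonapi.PyCapsule_GetPointer
PyCapsule_GetPointer.restype = ctypes.c_void_p
PyCapsule_GetPointer.argtypes = [ctypes.py_object, ctypes.c_char_p]


def make_capsule(address):
    """Wrap a non-null raw address in an unnamed PyCapsule with no destructor."""
    return PyCapsule_New(address, None, None)


# Hypothetical stand-in for storage that would hold a cudaEvent_t handle.
# In real code the handle would come from a CUDA library, not from a null value.
event_slot = ctypes.c_void_p(0)

# The capsule wraps the ADDRESS of the slot (a cudaEvent_t* in C terms).
# event_slot must stay alive for as long as the capsule is in use.
capsule = make_capsule(ctypes.addressof(event_slot))

print(type(capsule).__name__)
print(PyCapsule_GetPointer(capsule, None) == ctypes.addressof(event_slot))
```

Whether execute_async accepts a capsule built this way (for example, whether it requires an unnamed capsule) is something I have not verified, so treat this as an illustration of the type rather than a working recipe.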