How to use setOptimizationProfileAsync with executeV2

This question has two parts, and I would appreciate it if you could answer both.

  1. Are there any examples demonstrating how to use the setOptimizationProfileAsync function with executeV2?
    executeV2 is synchronous (and does not take a CUDA stream), whereas setOptimizationProfileAsync is asynchronous and requires one. How are these two meant to be used together? I’d ideally use setOptimizationProfile and keep everything synchronous, but that function is now deprecated.

  2. In my application, I will be rapidly switching between multiple batch sizes. For example, I’ll run inference with a batch size of 10, 10, 10, 4, 1, 1, 10, 10, 4, 1, 10, 1, 1, 1…

In a situation like the above, is it better to create a single IExecutionContext and change the optimization profile (using setOptimizationProfileAsync) each time the batch size changes?
Or is it better to create three separate execution contexts, each with its own optimization profile? Based on the batch size, I’d dispatch each inference request to the appropriate execution context.
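
For concreteness, here is a rough sketch of the multi-context option I have in mind; the profile-to-batch-size mapping (profiles 0/1/2 covering batches 1/4/10) and the helper names are illustrative, not from any sample:

```cpp
#include <NvInfer.h>
#include <cuda_runtime_api.h>
#include <vector>

// Sketch: create one execution context per optimization profile, up front.
// Each context is bound to its profile once, so no profile switching
// happens at inference time. Profile indices and the batch-size mapping
// below are illustrative for my workload.
std::vector<nvinfer1::IExecutionContext*> makeContexts(
    nvinfer1::ICudaEngine& engine, cudaStream_t stream)
{
    std::vector<nvinfer1::IExecutionContext*> contexts;
    for (int i = 0; i < engine.getNbOptimizationProfiles(); ++i)
    {
        nvinfer1::IExecutionContext* ctx = engine.createExecutionContext();
        ctx->setOptimizationProfileAsync(i, stream); // bind profile i to ctx
        contexts.push_back(ctx);
    }
    cudaStreamSynchronize(stream); // switches are enqueued on the stream
    return contexts;
}

// Dispatch a request to the context whose profile covers the batch size.
nvinfer1::IExecutionContext* pickContext(
    std::vector<nvinfer1::IExecutionContext*> const& contexts, int batchSize)
{
    if (batchSize == 1) return contexts[0]; // profile 0: batch 1
    if (batchSize <= 4) return contexts[1]; // profile 1: up to batch 4
    return contexts[2];                     // profile 2: up to batch 10
}
```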

Which will be faster? What memory overhead is incurred by creating multiple contexts? Is there a speed penalty when switching the optimization profile on a context?

Once again, I’d appreciate it if you could answer both questions. Thank you.

Hi,
Please check the link below, as it might answer your concerns.

Thanks!

I wasn’t able to find the answer to either of my questions on the page you linked; it has no relevant information. I’d appreciate it if you could take the time to type out a response to both questions.

Hi,

Currently we do not have an official sample.
We hope the following may help you.

https://docs.nvidia.com/deeplearning/tensorrt/api/c_api/classnvinfer1_1_1_i_execution_context.html#a74c361a3d93e70a3164988df7d60a4cc
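
Regarding part 1, here is a rough, unofficial sketch of one way to combine the two calls: enqueue the profile switch on a stream, synchronize the stream (since executeV2 does not take one), set the input shape for the new profile, and then run executeV2. The function name, the bindings array, and the use of binding index 0 for the input are placeholders for your engine:

```cpp
#include <NvInfer.h>
#include <cuda_runtime_api.h>

// Sketch: asynchronous profile switch followed by synchronous execution.
// `bindings` and input binding index 0 are placeholders for your engine.
bool inferWithProfile(nvinfer1::IExecutionContext& context, int profileIndex,
                      nvinfer1::Dims inputDims, void* const* bindings,
                      cudaStream_t stream)
{
    // The profile switch is enqueued on the stream...
    if (!context.setOptimizationProfileAsync(profileIndex, stream))
        return false;
    // ...so wait for it to complete before the stream-less executeV2.
    cudaStreamSynchronize(stream);

    // Input dimensions must be set again after every profile switch.
    if (!context.setBindingDimensions(0, inputDims))
        return false;

    // Synchronous execution; returns once inference has finished.
    return context.executeV2(bindings);
}
```

If the rest of your pipeline is synchronous, the cudaStreamSynchronize after the switch is what effectively recovers the behavior of the deprecated setOptimizationProfile.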

Thank you.