Releasing an optimization profile

I am using multiple TensorRT engines in my inference pipeline. For each of those engines, I create an execution context once, at initialization. All of these engines have multiple optimization profiles linked to different batch sizes (each engine has 8 optimization profiles). These profiles have to be switched depending on the variable input batch size. While reusing a particular profile in subsequent runs, I get the following error when setting context.active_optimization_profile -

[TensorRT] ERROR: Profile 3 has been chosen by another IExecutionContext. Use another profileIndex or destroy the IExecutionContext that use this profile.

According to the documentation-

When multiple execution contexts run concurrently, it is allowed to switch to a profile which was formerly used but already released by another execution context with different dynamic input dimensions.

How do I release the optimization profile from a context? I cannot afford to destroy and recreate the context after every batch, as engine.create_execution_context() is slow and incurs a lot of overhead.
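For reference, this is roughly how each engine and its context are created at startup (a simplified sketch; model.trt stands in for my actual engine file and buffer allocation is omitted) -

import pycuda.driver as cuda
import pycuda.autoinit
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Deserialize a prebuilt engine (built with 8 optimization profiles)
with open("model.trt", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

# Created once, up front; recreating this per batch is the overhead I want to avoid
context = engine.create_execution_context()
stream = cuda.Stream()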

Following is my inference code-

def do_inference(context, bindings, inputs, outputs, stream):
    # Copy input data from host to GPU
    [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
    # Run inference
    context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
    # Copy outputs back from GPU to host
    [cuda.memcpy_dtoh_async(out.host, out.device, stream) for out in outputs]
    # Synchronize the stream
    stream.synchronize()
    return [out.host for out in outputs]

Following is my profile switching code that calls the above do_inference method -

# Switch optimization profile depending upon the batch size
context.active_optimization_profile = dynamic_batch_size - 1
# Copy images into the host buffer for this profile
inputs[dynamic_batch_size - 1].host = images
# Do inference
output = do_inference(context, bindings, inputs, outputs, stream)
# Parse the output for this profile
output = output[dynamic_batch_size - 1]

Any help would be appreciated.

Hi @jasdeepchhabra94,
I believe the link below should help you:
https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/infer/Core/ExecutionContext.html

An IExecutionContext holds shared resources, so if you want parallel execution, you have to create a separate IExecutionContext for each CUDA stream.
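For example, a rough sketch (untested, and reusing the engine, buffers, and do_inference function from your post; the contexts/streams lists and the infer helper are just illustrative names): create one execution context per optimization profile once at startup, each paired with its own CUDA stream, and then pick the matching context per batch instead of reassigning profiles on a single shared context -

import pycuda.driver as cuda

# One context (and one stream) per optimization profile, created once at startup
contexts, streams = [], []
for profile_idx in range(engine.num_optimization_profiles):
    ctx = engine.create_execution_context()
    # A profile can only be claimed by one context at a time,
    # so bind each profile to its own context up front
    ctx.active_optimization_profile = profile_idx
    contexts.append(ctx)
    streams.append(cuda.Stream())

def infer(dynamic_batch_size, bindings, inputs, outputs):
    # Select the pre-built context (and its stream) for this batch size
    idx = dynamic_batch_size - 1
    return do_inference(contexts[idx], bindings, inputs, outputs, streams[idx])

Also, if you are on a newer TensorRT release, IExecutionContext.set_optimization_profile_async() is the supported way to select a profile on a context.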

Thanks!