I am using multiple TensorRT engines in my inference pipeline. For each of those engines, I create an execution context once, at initialization. All these engines have multiple optimization profiles linked to different batch sizes (each engine has 8 optimization profiles). These profiles have to be switched depending on the variable input batch size. When reusing a particular profile in subsequent runs, I get the following error:
[TensorRT] ERROR: Profile 3 has been chosen by another IExecutionContext. Use another profileIndex or destroy the IExecutionContext that use this profile.
According to the documentation:
When multiple execution contexts run concurrently, it is allowed to switch to a profile which was formerly used but already released by another execution context with different dynamic input dimensions.
How do I release the optimization profile from a context? I cannot afford to destroy and recreate the context after every batch, since engine.create_execution_context() is slow and incurs a lot of overhead.
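One pattern that sidesteps the error, since a given optimization profile can be active in only one IExecutionContext at a time, is to create a dedicated context per profile once at startup and pick the right one per batch, instead of reassigning active_optimization_profile on a single context. A minimal sketch of that cache (the ContextPool class and its method names are my own illustration, not a TensorRT API; with a real engine, context_for would pay the create_execution_context() cost once per profile, not once per batch):

```python
class ContextPool:
    """Cache one execution context per optimization profile.

    A TensorRT optimization profile can be claimed by only one
    IExecutionContext at a time, so rather than switching
    active_optimization_profile on a shared context, keep a
    dedicated context per profile and reuse it across batches.
    """

    def __init__(self, engine):
        self._engine = engine
        self._contexts = {}  # profile index -> execution context

    def context_for(self, profile_idx):
        # Create the context lazily, once per profile; subsequent
        # calls reuse it, avoiding repeated context creation overhead.
        if profile_idx not in self._contexts:
            ctx = self._engine.create_execution_context()
            ctx.active_optimization_profile = profile_idx
            self._contexts[profile_idx] = ctx
        return self._contexts[profile_idx]
```

With 8 profiles this holds 8 contexts per engine, so memory use goes up accordingly, but no context ever has to release a profile or be destroyed mid-pipeline.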
Following is my inference code:
def do_inference(context, bindings, inputs, outputs, stream):
    # Copy input data to the GPU
    [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
    # Run inference
    context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
    # Copy outputs back to the CPU
    [cuda.memcpy_dtoh_async(out.host, out.device, stream) for out in outputs]
    # Synchronize the stream
    stream.synchronize()
    return [out.host for out in outputs]
Following is my profile-switching code that calls the above do_inference method:
# Switch optimization profile depending upon batch size
context.active_optimization_profile = dynamic_batch_size - 1
# Copy images to the host buffer
inputs[dynamic_batch_size - 1].host = images
# Do inference
output = do_inference(context, bindings, inputs, outputs, stream)
# Parse output
output = output[dynamic_batch_size - 1]
Any help would be appreciated.