How to reduce the influence of other GPU operations on TensorRT?

When I run inference with a TensorRT engine, it takes more time if other GPU operations, such as a CUDA kernel, are also running.
I noticed that when I add a CUDA kernel before the TensorRT inference, EMC usage goes up.
Is the TensorRT inference time related to EMC usage?
What can I do to reduce the influence of EMC, or to reduce EMC usage?
I am working on an NVIDIA NX.

Hi,

Please note that TensorRT also uses the GPU for inference.
If other GPU jobs are running, TensorRT has to wait for the resources.

EMC is related to memory usage.
But in your use case, the slowdown is caused by the other GPU tasks rather than by memory.
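
If you want to check this on your side, one rough approach is to time the inference with and without the extra GPU work in front of it. The snippet below is only a minimal sketch in plain Python: `measure_latency_ms`, `infer`, and `extra_kernel` are hypothetical names, and the `time.sleep` calls are stand-ins that you would replace with your real TensorRT call and CUDA kernel launch. It assumes `run_once` blocks until its GPU work has finished (for example by synchronizing its CUDA stream); otherwise only the launch overhead is measured.

```python
import statistics
import time


def measure_latency_ms(run_once, before=None, warmup=5, iters=50):
    """Return the median wall-clock time of run_once() in milliseconds.

    before, if given, is called untimed ahead of every run_once() call.
    run_once must block until its own work is complete.
    """
    def one_pass():
        if before is not None:
            before()  # not included in the timed region
        start = time.perf_counter()
        run_once()
        return (time.perf_counter() - start) * 1000.0

    for _ in range(warmup):
        one_pass()  # discard warm-up runs
    return statistics.median(one_pass() for _ in range(iters))


# Hypothetical stand-ins: replace with your real TensorRT call and CUDA kernel.
def infer():
    time.sleep(0.002)  # pretend a synchronized inference takes about 2 ms


def extra_kernel():
    time.sleep(0.001)  # pretend the extra GPU work takes about 1 ms


if __name__ == "__main__":
    alone = measure_latency_ms(infer)
    with_kernel = measure_latency_ms(infer, before=extra_kernel)
    print(f"inference alone:       {alone:.2f} ms")
    print(f"inference after kernel: {with_kernel:.2f} ms")
```

With the sleep stand-ins both numbers come out about the same, because the extra work has already finished before the timer starts. With a real kernel that is launched asynchronously and is still running when the inference is enqueued, the second number would also include the time the inference spends waiting for the GPU.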

Thanks.