BERT may be a complex enough model to saturate a full A100 (without MIG). If that is the case, then switching inference to a MIG instance that is roughly 1/2 of an A100 could result in longer processing time and therefore higher latency.
No latency increase is expected simply from enabling MIG by itself. But if the MIG instance you select cannot process the inference request in the same amount of time as the full GPU, then latency will increase.
For example, I would expect very little latency difference for a single ResNet-50 (RN50, batch size 1) inference on a "full" A100 vs. a MIG instance of an A100. For other, more complex models there may be differences.
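If you want to measure this yourself, here is a minimal sketch (not from the original post) of a batch-size-1 latency check, assuming PyTorch and torchvision are installed. Run it once normally to use the full GPU, then again with `CUDA_VISIBLE_DEVICES` set to the MIG instance's UUID (you can list MIG UUIDs with `nvidia-smi -L`) and compare the numbers:

```python
# Illustrative latency benchmark: compare full A100 vs. a MIG instance
# by re-running this script with CUDA_VISIBLE_DEVICES=MIG-<uuid>.
import time
import torch
import torchvision.models as models

model = models.resnet50().eval().cuda()
x = torch.randn(1, 3, 224, 224, device="cuda")  # batch size 1

with torch.no_grad():
    for _ in range(10):               # warm-up iterations
        model(x)
    torch.cuda.synchronize()

    start = time.perf_counter()
    for _ in range(100):
        model(x)
    torch.cuda.synchronize()          # wait for all GPU work to finish
    elapsed = time.perf_counter() - start

print(f"mean latency: {elapsed / 100 * 1000:.2f} ms")
```

The warm-up loop and the `torch.cuda.synchronize()` calls matter: without them you measure kernel-launch overhead and lazy initialization rather than steady-state inference latency.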
There is a MIG user guide available. Detailed TensorRT (TRT) questions should be asked on the TRT forum.
You may also wish to review this for best practices; it requires sign-up/log-in.