Can someone please explain whether TensorRT is beneficial for inference even when AMP and XLA are already used? A preliminary look at what TensorRT does suggests that both do the same things: use Tensor Cores and FP16.
I ran some tests with AMP enabled for both plain TensorFlow and TensorRT. AMP improved the TensorFlow results slightly, but it had no impact on TensorRT performance. So my conclusion is that TensorRT already incorporates the benefits of AMP (FP16 execution on Tensor Cores), and therefore AMP does not provide an additional performance bump for TensorRT.
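For reference, here is a minimal sketch of the two configurations being compared, assuming TF 2.x with GPU support and TensorRT installed; the SavedModel paths are hypothetical placeholders. Requesting `FP16` at TF-TRT conversion time is what makes a separate AMP policy redundant for the TensorRT-converted model:

```python
import tensorflow as tf
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Config 1: native TensorFlow with AMP.
# The mixed_float16 policy makes layers compute in FP16
# (using Tensor Cores where available) while keeping
# variables in FP32.
tf.keras.mixed_precision.set_global_policy("mixed_float16")

# Config 2: TF-TRT conversion with FP16 precision.
# TensorRT applies its own FP16 / Tensor Core optimizations
# to the converted subgraphs, so enabling AMP on top of this
# does not reduce precision any further.
params = trt.TrtConversionParams(
    precision_mode=trt.TrtPrecisionMode.FP16,
)
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="saved_model",  # hypothetical path
    conversion_params=params,
)
converter.convert()
converter.save("saved_model_trt")  # hypothetical output path
```

In other words, AMP and TF-TRT FP16 are two entry points to the same hardware feature; once TensorRT owns the graph, it decides the precision of its engines itself.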