Does building FasterTransformer with C++ or PyTorch have any effect on latency?


NVIDIA/FasterTransformer on GitHub ("Transformer related optimization, including BERT, GPT") offers multiple build options, including C++ and PyTorch.

Query 1: Will choosing the C++ or the PyTorch build affect the observed latencies in any way?
Query 2: In what cases might C++ be preferred over PyTorch, or vice versa?