NVIDIA Developer Forums

Does building FasterTransformer with C++ or PyTorch have any effect on latency?


hemant.hbti October 19, 2022, 7:18am 1

Hi,

FasterTransformer (GitHub: NVIDIA/FasterTransformer, Transformer related optimization, including BERT, GPT) offers multiple build options.

Query 1: Will choosing C++ or PyTorch affect the observed latencies in any way?
Query 2: In which cases might C++ be preferred over PyTorch, or vice versa?
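
One way to approach Query 1 empirically is to time the same workload under each build with an identical harness. Below is a minimal sketch using only the Python standard library. `measure_latency` and `dummy_inference` are hypothetical names, not part of FasterTransformer, and the real forward call would replace the stand-in. For GPU work I am assuming a synchronization function such as `torch.cuda.synchronize` is passed as `sync`, so that asynchronously launched kernels are included in the timing. A pure C++ build would need equivalent timing done on the C++ side.

```python
import statistics
import time


def measure_latency(fn, warmup=10, iters=100, sync=None):
    """Time fn() and return latency statistics in milliseconds.

    sync is an optional callable run before the clock is read; for GPU
    work it could be torch.cuda.synchronize (an assumption here), so that
    queued kernels finish before the timer stops.
    """
    # Warm-up runs are discarded so one-time setup cost is not measured.
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()

    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()
        samples.append((time.perf_counter() - start) * 1000.0)

    samples.sort()
    return {
        "mean_ms": statistics.fmean(samples),
        # Approximate percentiles taken by index into the sorted samples.
        "p50_ms": samples[len(samples) // 2],
        "p99_ms": samples[min(len(samples) - 1, int(len(samples) * 0.99))],
    }


def dummy_inference():
    # Hypothetical stand-in for one forward pass; replace with the real call.
    sum(i * i for i in range(10_000))


if __name__ == "__main__":
    print(measure_latency(dummy_inference))
```

Running the same harness with identical batch size, sequence length, and precision for each build keeps the comparison fair; otherwise the difference measured may come from the inputs rather than from the build choice.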
