How to compile Sentence Transformer with Torch-TensorRT?

joaopcm1996 · August 29, 2022, 10:16am

I would like to use the .encode() method provided by SentenceTransformer models, and compile the model (or the encode method itself, reference here) with TensorRT for better GPU performance.

The Torch-TensorRT docs show that I would need to provide a TorchScript module for compilation with TensorRT, but it seems that it is currently not possible to torch.jit.script() a SentenceTransformer.

How can I compile the model with TensorRT, in a way where I can still use the .encode() method? Is this possible?

spolisetty · September 2, 2022, 5:35am

Hi,

You may be able to use torch.jit.trace to generate the TorchScript module.
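
A minimal sketch of what tracing looks like, assuming the part to be compiled can be expressed as a module that takes only tensors (token IDs and an attention mask) and returns pooled embeddings. TinyEncoder below is a hypothetical stand-in for the transformer inside a SentenceTransformer, not the library's own class, and the shapes and sizes are illustrative only:

```python
import torch
import torch.nn as nn


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for the transformer inside a SentenceTransformer."""

    def __init__(self, vocab_size: int = 100, hidden: int = 16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.proj = nn.Linear(hidden, hidden)

    def forward(self, input_ids: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        # Per-token embeddings, shape (batch, seq_len, hidden).
        token_embeddings = self.proj(self.embed(input_ids))
        # Mean pooling over non-padded tokens only.
        mask = attention_mask.unsqueeze(-1).to(token_embeddings.dtype)
        summed = (token_embeddings * mask).sum(dim=1)
        counts = mask.sum(dim=1).clamp(min=1.0)
        return summed / counts


torch.manual_seed(0)
model = TinyEncoder().eval()

# Example inputs: tracing records the operations executed for these tensors.
example_ids = torch.randint(0, 100, (2, 8))
example_mask = torch.ones(2, 8, dtype=torch.long)

with torch.no_grad():
    traced = torch.jit.trace(model, (example_ids, example_mask))
    eager_out = model(example_ids, example_mask)
    traced_out = traced(example_ids, example_mask)

# The traced module should reproduce the eager output on the same inputs.
max_diff = (eager_out - traced_out).abs().max().item()
print(type(traced).__name__, tuple(traced_out.shape), max_diff)
```

The traced module is the kind of object that would then be handed to Torch-TensorRT for compilation on a machine with a supported GPU; that step is not shown here. Note that tracing only captures tensor operations, so the tokenization and batching that .encode() performs in Python would most likely have to stay outside the compiled module and feed it tensors.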

Thank you.
