How to compile Sentence Transformer with Torch-TensorRT?

I would like to keep using the .encode() method provided by SentenceTransformer models while compiling the model (or the encode method itself) with TensorRT for better GPU performance.

The Torch-TensorRT docs indicate that I need to provide a TorchScript module for compilation with TensorRT, but it seems that it is currently not possible to torch.jit.script() a SentenceTransformer.

How can I compile the model with TensorRT, in a way where I can still use the .encode() method? Is this possible?

Hi,

You may be able to use torch.jit.trace to generate the TorchScript module.
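
As a rough illustration of the tracing step, here is a minimal sketch. It does not use a real SentenceTransformer: TinyEncoder is a hypothetical stand-in (an embedding layer, a linear layer, and mean pooling), and the vocabulary size, dimensions, and input shapes are all assumptions chosen to keep the example small. The idea is that tracing works on a forward pass that takes plain tensors (input_ids, attention_mask) and returns a tensor.

```python
import torch
import torch.nn as nn


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for a transformer followed by mean pooling."""

    def __init__(self, vocab_size: int = 100, dim: int = 16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, input_ids: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        hidden = self.proj(self.embed(input_ids))             # (batch, seq, dim)
        mask = attention_mask.unsqueeze(-1).to(hidden.dtype)  # (batch, seq, 1)
        summed = (hidden * mask).sum(dim=1)                   # ignore padded tokens
        counts = mask.sum(dim=1).clamp(min=1e-9)              # avoid division by zero
        return summed / counts                                # (batch, dim)


torch.manual_seed(0)
model = TinyEncoder().eval()

# Example inputs with assumed shapes: batch of 2, sequence length 8.
input_ids = torch.randint(0, 100, (2, 8))
attention_mask = torch.ones(2, 8, dtype=torch.long)

with torch.no_grad():
    # Tracing records the tensor operations executed for these example inputs.
    traced = torch.jit.trace(model, (input_ids, attention_mask))
    eager_out = model(input_ids, attention_mask)
    traced_out = traced(input_ids, attention_mask)
```

Tracing captures only the tensor operations in the forward pass, so the tokenization and batching that .encode() performs would not be part of the traced module. One likely approach is to tokenize outside the compiled module and feed it the resulting tensors, then hand the traced module to Torch-TensorRT for compilation.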

Thank you.