Convert Whisper model to ONNX

Description

I am working on converting the Whisper (small) model to TensorRT. My steps are:

  1. Separate Whisper into two models: the encoder and the decoder.
  2. Convert both models to ONNX.
  3. Use trtexec to build a .engine file from each ONNX model.

However, the decoder does not contain beam search, so it cannot return token IDs directly. I want to build a full pipeline that runs them with TensorRT. Please help me.
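For reference, step 3 might look like the following trtexec invocations. This is a sketch: the ONNX file names and the decoder's dynamic-shape profile (input name and a 448-token cap, Whisper's usual max target length) are assumptions for illustration, not taken from the post.

```shell
# Build the encoder engine (fixed mel-spectrogram input; fp16 suits the T4).
trtexec --onnx=whisper_encoder.onnx \
        --saveEngine=whisper_encoder.engine \
        --fp16

# Build the decoder engine with a dynamic profile over the growing token
# sequence (the "input_ids" tensor name is assumed from the ONNX export).
trtexec --onnx=whisper_decoder.onnx \
        --saveEngine=whisper_decoder.engine \
        --fp16 \
        --minShapes=input_ids:1x1 \
        --optShapes=input_ids:1x64 \
        --maxShapes=input_ids:1x448
```

The dynamic profile matters because the decoder is called once per generated token with a sequence that grows each step.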

Environment

TensorRT Version: 8.6.1
GPU Type: T4
Nvidia Driver Version: 12
CUDA Version: 12
CUDNN Version:
Operating System + Version: Ubuntu 22.04
Python Version (if applicable): 3.10.10
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 2.0.1
Baremetal or Container (if container which image + tag):

Relevant Files

Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (Github repo, Google Drive, Dropbox, etc.)

Steps To Reproduce

Please include:

  • Exact steps/commands to build your repro
  • Exact steps/commands to run your repro
  • Full traceback of errors encountered

In release/9.0, we added support for a Vision2Seq model (BLIP); the logic for running beam search with Whisper is very similar. You will need to create an object similar to HuggingFace's GenerationMixin to drive beam search, or implement the logic yourself: TensorRT generates logits from input_ids + encoder_hidden_states, and the search loop runs on top of that. See TensorRT/demo/HuggingFace/BLIP at release/9.1 · NVIDIA/TensorRT (github.com).
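To make the "implement the logic yourself" route concrete, here is a minimal beam-search sketch. The `step_fn` callable stands in for one TensorRT decoder-engine call (input_ids + encoder_hidden_states → logits); in a real pipeline it would run the decoder engine and return next-token log-probabilities, but here it is any function from a token sequence to a 1-D log-prob array. Names, defaults, and the toy vocabulary below are illustrative assumptions, not the BLIP demo's actual API.

```python
import numpy as np

def beam_search(step_fn, bos_id, eos_id, beam_size=3, max_len=20):
    """Minimal beam search over `step_fn`, a stand-in for a TensorRT
    decoder call that maps a token-id sequence to log-probs over the
    vocabulary. Returns the highest-scoring finished sequence."""
    beams = [([bos_id], 0.0)]          # (tokens, cumulative log-prob)
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            if tokens[-1] == eos_id:   # beam already ended: retire it
                finished.append((tokens, score))
                continue
            log_probs = step_fn(tokens)
            # Expand each live beam with its top-k next tokens.
            for tok in np.argsort(log_probs)[-beam_size:]:
                candidates.append((tokens + [int(tok)],
                                   score + float(log_probs[tok])))
        if not candidates:             # every beam has finished
            break
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    finished.extend(b for b in beams if b[0][-1] == eos_id)
    if not finished:                   # hit max_len with no EOS
        finished = beams
    return max(finished, key=lambda c: c[1])[0]

def toy_step(tokens):
    """Toy stand-in for the decoder engine: after BOS (0) prefer
    token 1, after token 1 prefer EOS (3)."""
    lp = np.full(4, -10.0)
    if tokens[-1] == 0:
        lp[1] = -0.1
    elif tokens[-1] == 1:
        lp[3] = -0.1
    return lp
```

With the toy step function, `beam_search(toy_step, bos_id=0, eos_id=3)` returns `[0, 1, 3]`. For Whisper you would additionally thread `encoder_hidden_states` (and ideally a KV cache) through `step_fn`, and batch the beams into a single engine execution per step instead of looping over them in Python.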