OpenAI Whisper TensorRT

Description

OpenAI Whisper converts to TensorRT, but the converted model outputs only a single correct token and then emits the EOS token.

Environment

TensorRT Version: 8.6.1
GPU Type: dGPU
Nvidia Driver Version: 535
CUDA Version: 12.2

Baremetal or Container (if container which image + tag): nvcr.io/nvidia/tensorrt:23.09-py3

Relevant Files

whisper_ipynb.zip (20.5 KB)
Whisper (2).zip (43.3 KB)

Steps To Reproduce

  1. Clone the NVIDIA TensorRT repo
  2. Extract Whisper.zip inside demo/HuggingFace
  3. Extract whisper_ipynb.zip inside demo/HuggingFace/notebooks
  4. Run all cells of whisper.ipynb in demo/HuggingFace/notebooks

I followed the BART demo and converted Whisper to TensorRT, but my decoder generates only a single token before outputting the EOS token.

Thanks in advance.

Hi,
Please share the ONNX model and the script, if not already shared, so that we can assist you better.
In the meantime, you can try a few things:

  1. Validate your model with the snippet below:

check_model.py

import sys
import onnx

# Usage: python check_model.py your_model.onnx
model = onnx.load(sys.argv[1])
onnx.checker.check_model(model)
  2. Try running your model with the trtexec command.

In case you are still facing the issue, please share the trtexec --verbose log for further debugging.
Thanks!

So, I tried to mimic the BART demo that is already in the TensorRT repo, and I checked the ONNX model; there are no issues with it. I just want to know whether I need to do anything differently from the BART conversion, because no matter what, the TensorRT/ONNX models produced by the conversion output only one token before emitting the EOS token, while the PyTorch model used for the conversion behaves correctly and outputs the right tokens.
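For reference, the expected behavior is a greedy decode loop that keeps appending the argmax token until EOS. A minimal sketch of that loop, with a hypothetical `decoder_step` callable standing in for the TensorRT decoder engine (the EOS id below is illustrative, not Whisper's actual value):

```python
import numpy as np

EOS_TOKEN = 50257  # illustrative EOS id; the real id depends on the checkpoint


def greedy_decode(decoder_step, start_tokens, max_len=32):
    """Greedy loop: feed the full token sequence back in each step
    and append the argmax token until EOS (or max_len) is reached."""
    tokens = list(start_tokens)
    for _ in range(max_len):
        logits = decoder_step(tokens)  # logits for the last position
        next_token = int(np.argmax(logits))
        tokens.append(next_token)
        if next_token == EOS_TOKEN:
            break
    return tokens


# Dummy step standing in for the engine: emits 5, 6, 7, then EOS.
def dummy_step(tokens):
    script = [5, 6, 7, EOS_TOKEN]
    logits = np.zeros(EOS_TOKEN + 1)
    step = len(tokens) - 1  # one new token per call after the start token
    logits[script[min(step, len(script) - 1)]] = 1.0
    return logits


print(greedy_decode(dummy_step, [1]))  # [1, 5, 6, 7, 50257]
```

A decoder that hits the `if next_token == EOS_TOKEN` branch on the very first iteration, as described above, points at the engine's logits rather than the loop itself.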

The notebook I shared generates both the ONNX and TensorRT models.

An interesting thing that happens with both the ONNX and TensorRT models is that the logits for previously generated tokens change as well.
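Logits shifting for earlier positions usually points at a masking or KV-cache problem: with a correct causal mask, appending a new token must not change the attention outputs at previous positions. A toy single-head attention in NumPy (a simplified sketch, not the Whisper implementation) demonstrates the invariant you can check for:

```python
import numpy as np


def causal_attention(q, k, v):
    """Single-head scaled dot-product attention with a causal mask."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    # Mask out future positions so token i only attends to tokens <= i.
    mask = np.triu(np.ones(scores.shape, dtype=bool), k=1)
    scores[mask] = -np.inf
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v


rng = np.random.default_rng(0)
x5 = rng.normal(size=(5, 8))                    # 5 tokens, dim 8 (q = k = v here)
x6 = np.vstack([x5, rng.normal(size=(1, 8))])   # same 5 tokens plus 1 new one

out5 = causal_attention(x5, x5, x5)
out6 = causal_attention(x6, x6, x6)

# With causal masking, the first 5 outputs are identical in both runs.
print(np.allclose(out5, out6[:5]))  # True
```

If the equivalent comparison on the exported decoder fails, the causal mask was likely lost or mis-shaped during the ONNX export.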

I also followed your steps, but I see that NNDF is not defined. Could you tell me how to install it?

Where exactly is NNDF not defined? You just need to run the notebook.
Just make sure Whisper.zip is extracted in demo/HuggingFace as Whisper and whisper_ipynb.zip is extracted in demo/HuggingFace/notebooks.

Actually, I have extracted Whisper.zip, but when I ran the cells, it raised an error about a missing NNDF module. Some of the modules in Whisper.zip use NNDF, which confused me a bit. I tried to find NNDF on Google, but I could not find it to install.

NNDF already exists in demo/HuggingFace here https://github.com/NVIDIA/TensorRT/tree/release/8.6/demo/HuggingFace/NNDF

Oh, I see. Many thanks. I have the same task as you: converting the model to TensorRT. So we can discuss it together.

Sure, let me know if we can connect, maybe on Discord?

Okay, you can connect with me on Discord: ngoctuhan12