OpenAI Whisper TensorRT

Description

OpenAI Whisper converts to TensorRT, but the converted model outputs only a single correct token and then emits the EOS token.

Environment

TensorRT Version: 8.6.1
GPU Type: dGPU
Nvidia Driver Version: 535
CUDA Version: 12.2

Baremetal or Container (if container which image + tag): nvcr.io/nvidia/tensorrt:23.09-py3

Relevant Files

whisper_ipynb.zip (20.5 KB)
Whisper (2).zip (43.3 KB)

Steps To Reproduce

  1. Clone the NVIDIA TensorRT repo
  2. Extract Whisper.zip inside demo/HuggingFace
  3. Extract whisper_ipynb.zip inside demo/HuggingFace/notebooks
  4. Run all cells of whisper.ipynb in demo/HuggingFace/notebooks

I followed the BART demo and converted Whisper to TensorRT, but my decoder generates only a single token before outputting the EOS token.

Thanks in advance.

Hi,
Please share the ONNX model and the script, if not already shared, so that we can assist you better.
In the meantime, you can try a few things:

  1. Validate your model with the snippet below:

check_model.py

import sys
import onnx

# Usage: python check_model.py your_model.onnx
model = onnx.load(sys.argv[1])
onnx.checker.check_model(model)
  2. Try running your model with the trtexec command.

In case you are still facing the issue, please share the trtexec --verbose log for further debugging.
Thanks!

So, I tried to mimic the BART demo that is already in the TensorRT repo, and I checked the ONNX model; there are no issues with it. I just want to know whether I need to do anything differently from the BART conversion, because no matter what, the TensorRT/ONNX models produced by the conversion output only one token before emitting the EOS token, while the PyTorch model used for the conversion behaves correctly and outputs the right tokens.
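For reference, the expected behavior is a greedy decode loop that keeps appending the argmax token until EOS. A minimal sketch of that loop, with a hypothetical `decoder_step` callable standing in for the TensorRT decoder engine (the EOS id below is illustrative, not Whisper's actual value):

```python
import numpy as np

EOS_TOKEN = 50257  # illustrative EOS id; the real id depends on the checkpoint


def greedy_decode(decoder_step, start_tokens, max_len=32):
    """Greedy loop: feed the full token sequence back in each step
    and append the argmax token until EOS (or max_len) is reached."""
    tokens = list(start_tokens)
    for _ in range(max_len):
        logits = decoder_step(tokens)  # logits for the last position
        next_token = int(np.argmax(logits))
        tokens.append(next_token)
        if next_token == EOS_TOKEN:
            break
    return tokens


# Dummy step standing in for the engine: emits 5, 6, 7, then EOS.
def dummy_step(tokens):
    script = [5, 6, 7, EOS_TOKEN]
    logits = np.zeros(EOS_TOKEN + 1)
    step = len(tokens) - 1  # one new token per call after the start token
    logits[script[min(step, len(script) - 1)]] = 1.0
    return logits


print(greedy_decode(dummy_step, [1]))  # [1, 5, 6, 7, 50257]
```

A decoder that hits the `if next_token == EOS_TOKEN` branch on the very first iteration, as described above, points at the engine's logits rather than the loop itself.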

The notebook I shared generates both the ONNX and TensorRT models.

An interesting thing that happens with both the ONNX and TensorRT models is that the logits for previously generated tokens change as well.
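Logits shifting for earlier positions usually points at a masking or KV-cache problem: with a correct causal mask, appending a new token must not change the attention outputs at previous positions. A toy single-head attention in NumPy (a simplified sketch, not the Whisper implementation) demonstrates the invariant you can check for:

```python
import numpy as np


def causal_attention(q, k, v):
    """Single-head scaled dot-product attention with a causal mask."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    # Mask out future positions so token i only attends to tokens <= i.
    mask = np.triu(np.ones(scores.shape, dtype=bool), k=1)
    scores[mask] = -np.inf
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v


rng = np.random.default_rng(0)
x5 = rng.normal(size=(5, 8))                    # 5 tokens, dim 8 (q = k = v here)
x6 = np.vstack([x5, rng.normal(size=(1, 8))])   # same 5 tokens plus 1 new one

out5 = causal_attention(x5, x5, x5)
out6 = causal_attention(x6, x6, x6)

# With causal masking, the first 5 outputs are identical in both runs.
print(np.allclose(out5, out6[:5]))  # True
```

If the equivalent comparison on the exported decoder fails, the causal mask was likely lost or mis-shaped during the ONNX export.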

I also followed your steps, but I see that NNDF is not defined. Could you tell me how to install it?

Where exactly is NNDF not defined? You just need to run the notebook.
Just make sure Whisper.zip is extracted in demo/HuggingFace as Whisper and whisper_ipynb.zip is extracted in demo/HuggingFace/notebooks.

Actually, I have extracted Whisper.zip, but when I ran the cells, it raised an error about a missing NNDF module. Some of the modules in Whisper.zip use NNDF, which confused me a bit. I tried to find NNDF on Google, but I could not find it to install.

NNDF already exists in demo/HuggingFace here https://github.com/NVIDIA/TensorRT/tree/release/8.6/demo/HuggingFace/NNDF

Oh, I see. Many thanks. I have the same task as you: converting the model to TensorRT. So we can discuss it together.

Sure, let me know if we can connect, maybe on Discord?

Okay, you can connect with me on Discord: ngoctuhan12