Hi,
Request you to share the ONNX model and the script if not shared already so that we can assist you better.
Alongside, you can try a few things:
1) Validate your model with the below snippet:
check_model.py
import sys
import onnx

# Usage: python check_model.py <your_model.onnx>
filename = sys.argv[1]
model = onnx.load(filename)
onnx.checker.check_model(model)
2) Try running your model with the trtexec command.
In case you are still facing issues, please share the trtexec "--verbose" log for further debugging.
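For reference, a typical trtexec invocation for this kind of check might look like the following sketch (the model and engine file names are placeholders; adjust them to your paths):

```shell
# Parse the ONNX model, build a TensorRT engine, and capture a verbose log
trtexec --onnx=model.onnx --saveEngine=model.engine --verbose > trtexec_verbose.log 2>&1
```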
Thanks!
So, I tried to mimic the BART demo that is already in the TensorRT repo, and I checked the ONNX model; there are no issues with it. I just want to know if there is something I need to do differently from the BART conversion, because no matter what, the TensorRT/ONNX model produced by the conversion outputs only 1 token before emitting the EOS token, while the PyTorch model used for the conversion behaves correctly and outputs the right tokens.
The notebook I shared generates both the ONNX and TensorRT models.
An interesting thing that happens with both the ONNX and TensorRT models is that the logits for the previously generated tokens change between decoding steps as well.
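To make that observation concrete, here is a rough way to check whether the logits for already-generated tokens stay stable from one decoding step to the next. This is only a sketch: it uses numpy arrays as stand-ins for the logits you would pull out of the PyTorch or ONNX Runtime decoder at consecutive steps (the function name and tolerance are my own, not from the notebook):

```python
import numpy as np

def past_logits_stable(logits_prev, logits_curr, atol=1e-3):
    """Check that the logits for tokens already generated at the
    previous step are numerically unchanged at the current step.

    logits_prev: (seq_len, vocab) logits from decoding step n
    logits_curr: (seq_len + 1, vocab) logits from decoding step n + 1
    """
    overlap = logits_prev.shape[0]
    return np.allclose(logits_curr[:overlap], logits_prev, atol=atol)

# Toy example: the current step reproduces the previous step's rows,
# so the past-token logits are considered stable.
prev = np.array([[0.1, 0.9], [0.7, 0.3]])
curr = np.vstack([prev, [[0.5, 0.5]]])
print(past_logits_stable(prev, curr))
```

If this returns False for the exported model but True for the PyTorch model at the same step, that points at the export or the decoder inputs (e.g. attention masks or cache handling) rather than at the sampling logic.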
Where exactly is NNDF not defined? You just need to run the notebook.
Just make sure whisper.zip is extracted into demo/HuggingFace as Whisper, and that whisper.ipynb is extracted into demo/HuggingFace/notebooks.
Actually, I have extracted Whisper.zip, but when I ran the cell, it raised an error about a missing module named NNDF. I can see that some modules in Whisper.zip use NNDF, which confuses me a bit. I have tried to find NNDF on Google, but I cannot find it to install.