I am facing an issue while converting the T5-base model using the steps in the blog post "Optimizing T5 and GPT-2 for Real-Time Inference with NVIDIA TensorRT" on the NVIDIA Technical Blog. I was able to convert the T5-small model to TRT using that blog and the associated notebook.
Below is the issue I am facing when converting T5-base to TRT:
PolygraphyException                        Traceback (most recent call last)
      1 t5_trt_encoder_engine = T5EncoderONNXFile(
      2     os.path.join(onnx_model_path, encoder_onnx_model_fpath), metadata
----> 3 ).as_trt_engine(os.path.join(tensorrt_model_path, encoder_onnx_model_fpath) + ".engine")
      5 t5_trt_decoder_engine = T5DecoderONNXFile(

in func_impl(network, config, save_timing_cache)

/usr/local/lib/python3.7/dist-packages/polygraphy/logger/logger.py in critical(self, message)
    347         from polygraphy.exception import PolygraphyException
--> 349         raise PolygraphyException(message) from None
    351     def internal_error(self, message):

PolygraphyException: Invalid Engine. Please ensure the engine was built correctly
I have also tried increasing the precision to FP32, but I still get the same error.
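For reference, one common way this "Invalid Engine" error arises is when the TensorRT builder fails during the build (for example, running out of workspace or GPU memory on the larger T5-base network) and `as_trt_engine` ends up wrapping a null engine. Below is a minimal sketch of rebuilding the encoder engine directly with Polygraphy's Python API, with verbose logging enabled so the underlying builder error is visible, and a larger workspace limit. The file paths and the 8 GiB workspace value are assumptions, not values from the blog, and this requires a machine with TensorRT and a GPU:

```python
import os

from polygraphy.backend.trt import (
    CreateConfig,
    engine_from_network,
    network_from_onnx_path,
    save_engine,
)
from polygraphy.logger import G_LOGGER

# Surface the full TensorRT build log so the real failure is visible,
# instead of only the final "Invalid Engine" message.
G_LOGGER.severity = G_LOGGER.VERBOSE

# Assumed paths -- substitute the ones used in your notebook.
onnx_path = os.path.join("models", "t5-base-encoder.onnx")
engine_path = onnx_path + ".engine"

# Give the builder more workspace memory; T5-base needs noticeably more
# than T5-small (8 GiB here is an assumption -- tune it to your GPU).
config = CreateConfig(max_workspace_size=8 << 30)

# Parse the ONNX model, build the engine, and serialize it to disk.
engine = engine_from_network(network_from_onnx_path(onnx_path), config=config)
save_engine(engine, engine_path)
```

If the verbose log shows an out-of-memory or tactic-selection failure, freeing GPU memory or adjusting the workspace limit before rerunning the notebook cell may get past this error.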