Hi, I am trying to solve a token classification problem using BERT ('bert-base-cased') from Hugging Face transformers. I was able to convert the BERT model to ONNX format and then to a TensorRT engine.
Can somebody share sample Python code to run inference using the TensorRT engine?
When I run predictions on BERT (without TensorRT), I pass the inputs as a dictionary to the 'predict' method: dict_keys(['labels', 'input_ids', 'token_type_ids', 'attention_mask'])
I am confused about how to pass these inputs to the TensorRT engine, as it does not accept them as-is.
What input and output shapes do I have to provide to run predictions?
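In case it helps others, here is a minimal sketch of what such an inference call can look like with the TensorRT 8.x Python API and pycuda. The engine path, binding names, and shapes below are assumptions: the binding names must match the input names used during the ONNX export (you can inspect them with Netron), the inputs are typically int32 arrays of shape (batch_size, seq_len), and for token classification the output is typically logits of shape (batch_size, seq_len, num_labels). The 'labels' key is only needed for training, so it is dropped before inference.

```python
import numpy as np

# Assumed binding names -- they must match the input names used
# when the model was exported to ONNX (inspect the .onnx file to confirm).
INPUT_NAMES = ["input_ids", "token_type_ids", "attention_mask"]

def build_bert_feed(input_ids, token_type_ids, attention_mask):
    """Drop 'labels' (not needed for inference) and return contiguous
    int32 arrays of shape (batch_size, seq_len), which is what a
    typical BERT ONNX export expects after conversion to TensorRT."""
    feed = {
        "input_ids": input_ids,
        "token_type_ids": token_type_ids,
        "attention_mask": attention_mask,
    }
    return {k: np.ascontiguousarray(np.asarray(v, dtype=np.int32))
            for k, v in feed.items()}

def run_trt_inference(engine_path, feed):
    """Run one synchronous inference pass with the TensorRT 8.x API.
    'engine_path' is a placeholder for your serialized engine file."""
    import tensorrt as trt
    import pycuda.autoinit  # noqa: F401 -- creates the CUDA context
    import pycuda.driver as cuda

    logger = trt.Logger(trt.Logger.WARNING)
    with open(engine_path, "rb") as f:
        engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())
    context = engine.create_execution_context()

    bindings = [None] * engine.num_bindings
    outputs = {}

    # Copy each input to the GPU and record its binding pointer.
    for name in INPUT_NAMES:
        idx = engine.get_binding_index(name)
        context.set_binding_shape(idx, feed[name].shape)  # dynamic shapes
        d_in = cuda.mem_alloc(feed[name].nbytes)
        cuda.memcpy_htod(d_in, feed[name])
        bindings[idx] = int(d_in)

    # Allocate host/device buffers for every output binding.
    for idx in range(engine.num_bindings):
        if engine.binding_is_input(idx):
            continue
        shape = tuple(context.get_binding_shape(idx))
        dtype = trt.nptype(engine.get_binding_dtype(idx))
        h_out = np.empty(shape, dtype=dtype)
        d_out = cuda.mem_alloc(h_out.nbytes)
        bindings[idx] = int(d_out)
        outputs[engine.get_binding_name(idx)] = (h_out, d_out)

    context.execute_v2(bindings)  # synchronous execution

    for h_out, d_out in outputs.values():
        cuda.memcpy_dtoh(h_out, d_out)
    return {name: h for name, (h, _) in outputs.items()}
```

Usage would be something like `run_trt_inference("bert.engine", build_bert_feed(ids, type_ids, mask))`, then `np.argmax(logits, axis=-1)` on the returned logits to get per-token label ids. Note that if the ONNX export declared int64 inputs, TensorRT usually casts them to int32 internally, so feeding int32 is normally the safe choice.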
I am using the code given in the GitHub link below:
Environment:
TensorRT Version: 220.127.116.11
GPU Type: Tesla V100-SXM2
Nvidia Driver Version: 460.73.01
CUDA Version: 11.2.2
CUDNN Version: 18.104.22.168
Operating System + Version: Ubuntu 20.04.1
Python Version (if applicable): 3.7
TensorFlow Version (if applicable): 2.7
PyTorch Version (if applicable): n/a
Baremetal or Container (if container which image + tag): container