Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[lm_head.bias.../Cast]}.)


Hi there,

I want to convert a statically quantized transformer model to TensorRT. It is a CodeGenForCausalLM from the transformers library. I used the following command to convert the model:

 trtexec --onnx=trt/decoder_model_quantized.onnx --int8 --minShapes=input_ids:1x1,attention_mask:1x1 --maxShapes=input_ids:1x512,attention_mask:1x512 --saveEngine=model-quantized.onnx.plan --device=0 --allowGPUFallback --useCudaGraph

The error is as follows:

[03/22/2023-14:01:53] [I] [TRT] Local timing cache in use. Profiling results in this builder pass will not be stored.
[03/22/2023-14:02:03] [W] [TRT] Myelin graph with multiple dynamic values may have poor performance if they differ. Dynamic values are: 
[03/22/2023-14:02:03] [W] [TRT]  (- 0 (CAST_F_TO_I (FLOOR (DIV_F (MUL_ADD_F -1 (CAST_I_TO_F sequence_length) 0) 1))))
[03/22/2023-14:02:03] [W] [TRT]  sequence_length
[03/22/2023-14:02:04] [W] [TRT] Skipping tactic 0x0000000000000000 due to Myelin error: [canonicalize_axis] Operation /transformer/ln_f/Constant_1_output_0_QuantizeLinear has out of range axis value 0.
[03/22/2023-14:02:04] [E] Error[10]: [optimizer.cpp::computeCosts::3728] Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[lm_head.bias.../Cast]}.)
[03/22/2023-14:02:04] [E] Error[2]: [builder.cpp::buildSerializedNetwork::751] Error Code 2: Internal Error (Assertion engine != nullptr failed. )
[03/22/2023-14:02:04] [E] Engine could not be created from network
[03/22/2023-14:02:04] [E] Building engine failed
[03/22/2023-14:02:04] [E] Failed to create engine from model or file.
[03/22/2023-14:02:04] [E] Engine set up failed


Docker Image:
TensorRT Version 8.501
Python 3.8.10
transformers 4.26.1
optimum 1.7.1
onnx 1.12.0
onnxruntime-gpu 1.14.1
pytorch-quantization 2.1.2
pytorch-triton 2.0.0+b8b470bc59
torch 1.13.1
torch-tensorrt 1.3.0
torchtext 0.13.0a0+fae8e8c
torchvision 0.15.0a0

Steps To Reproduce

  1. Convert the transformer model to ONNX via optimum-cli.
  2. Quantize the model (we did this exactly as in the quantization guide).
  3. Run the model as suggested by trtexec.
  4. Run the command from above.
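
Before handing the quantized model to trtexec, it can help to sanity-check that it still runs under onnxruntime with the same dynamic shapes. A minimal sketch, assuming the file name from the command above and that the graph takes `input_ids` and `attention_mask` of shape batch x sequence (both assumptions taken from the trtexec shape profile, not verified against the actual model):

```python
import numpy as np

def make_feed(batch, seq_len):
    """Build dummy int64 inputs matching the trtexec shape profile
    (minShapes=1x1 ... maxShapes=1x512)."""
    return {
        "input_ids": np.ones((batch, seq_len), dtype=np.int64),
        "attention_mask": np.ones((batch, seq_len), dtype=np.int64),
    }

# Usage (assumes onnxruntime-gpu from the environment listed below):
#   import onnxruntime as ort
#   sess = ort.InferenceSession("trt/decoder_model_quantized.onnx",
#                               providers=["CUDAExecutionProvider"])
#   outputs = sess.run(None, make_feed(1, 512))
```

If this already fails under onnxruntime, the problem is in the export or quantization step rather than in TensorRT.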

Thank you very much for any help!

Hi,
We recommend that you check the supported features at the link below.

You can refer to the link below for the full list of supported operators.
For unsupported operators, you need to create a custom plugin to support the operation.
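
To see which operators the exported graph actually uses, and compare them against the supported-operator list, here is a minimal sketch using the `onnx` Python package (already in the environment above); the model path is assumed from the trtexec command:

```python
from collections import Counter

def count_ops(graph):
    """Count how often each op type appears in an ONNX graph's node list."""
    return Counter(node.op_type for node in graph.node)

# Usage (assumes the quantized model file from the original post):
#   import onnx
#   model = onnx.load("trt/decoder_model_quantized.onnx")
#   for op, n in count_ops(model.graph).most_common():
#       print(op, n)
```

Any op type printed that is not on the supported-operator list is a candidate for a custom plugin.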


Thank you for your reply.

It seems that lm_head.bias.../Cast is not in the list of supported operators, right?