Hi, my input data and weights are in fp16 format during convolution inference. Does fp32 appear anywhere during the convolution computation? If so, how is the result converted from fp32 back to fp16?
As shown in the figure, the profiling tool shows that the conv operation calls these kernel functions. Does this mean the fp16 convolution involves a truncation from fp32 to fp16? How is it truncated, and at which compute node does it occur? Is the accumulation done in fp32 and only truncated to fp16 at the end? Looking forward to your answer, thank you!
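To illustrate what I mean by "accumulated and finally truncated": a conceptual NumPy sketch of one output element of an fp16 dot product, where the products and the running sum are kept in fp32 and only the final value is rounded back to fp16. This is just my own sketch of the pattern I am asking about, not TensorRT's actual kernel implementation:

```python
import numpy as np

def fp16_dot_fp32_accumulate(a_fp16, b_fp16):
    """Conceptual sketch (not TensorRT's actual kernel): multiply fp16
    inputs, accumulate in fp32, round back to fp16 once at the end."""
    acc = np.float32(0.0)
    for x, y in zip(a_fp16, b_fp16):
        # products and the running sum stay in fp32
        acc += np.float32(x) * np.float32(y)
    # single rounding step from fp32 back to fp16 (round-to-nearest-even)
    return np.float16(acc)

a = np.array([0.1, 0.2, 0.3], dtype=np.float16)
b = np.array([1.0, 2.0, 3.0], dtype=np.float16)
print(fp16_dot_fp32_accumulate(a, b))
```

My question is whether the kernels shown in the profile follow this pattern (accumulate in fp32, one truncation per output element) or truncate to fp16 at some intermediate step.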
Hi,
Can you try running your model with the trtexec command and share the `--verbose` log in case the issue persists?
You can refer to the link below for the list of supported operators; if an operator is not supported, you need to create a custom plugin to support that operation.
Also, please share your model and script if you haven't already, so that we can help you better.
Meanwhile, for some common errors and queries, please refer to the link below: