Is it possible to run fp16 for the first and last layers and int8 for all other layers?

Hi,

As the title says, is it possible to run fp16 for the first and last layers and int8 for all other layers? If so, is there a way to do it with PyTorch?

Best

Hi @zhihaosooner ,
Apologies for the delay; we are checking this with the team and will share an update soon.

Thanks

Hi @zhihaosooner ,
Yes, that is possible: you can set the layers you want to fp16 and keep the others in int8.
This function is an example of how to force layer precision with TRT, this is how the TRT INetwork is created from ONNX, and this is how you can set the 0th input of the ONNX network to whatever precision you want.
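In case the linked samples are not handy, here is a minimal sketch of the idea using TensorRT's Python API. It is not the exact code from the samples above: the ONNX path, the choice of pinning layers by index, and the calibrator argument are placeholder assumptions you would adapt to your own model. The sketch parses an explicit-batch network from ONNX, enables both int8 and fp16, pins the first and last layers to fp16, and sets OBEY_PRECISION_CONSTRAINTS so TRT honors the per-layer requests rather than treating them as hints.

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)

def build_mixed_precision_engine(onnx_path, calibrator=None):
    builder = trt.Builder(logger)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)

    # Parse the ONNX model into a TRT INetworkDefinition
    if not parser.parse_from_file(onnx_path):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("Failed to parse ONNX model")

    config = builder.create_builder_config()
    # Enable both precisions so TRT can pick int8 or fp16 per layer
    config.set_flag(trt.BuilderFlag.INT8)
    config.set_flag(trt.BuilderFlag.FP16)
    # Honor per-layer precision requests instead of treating them as hints
    config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)
    if calibrator is not None:
        config.int8_calibrator = calibrator

    # Pin the first and last layers to fp16; all other layers remain
    # eligible for int8
    for index in (0, network.num_layers - 1):
        layer = network.get_layer(index)
        layer.precision = trt.float16
        layer.set_output_type(0, trt.float16)

    return builder.build_serialized_network(network, config)
```

Note that layer index 0 in the parsed network is not guaranteed to be the model's first compute layer (the ONNX parser may insert shuffles or constants), so in practice you may prefer to select layers by name or type rather than by index.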
Thanks

Thanks