[Hugging Face transformer models + pytorch_quantization] PTQ int8 quantization is slower than fp16

Thank you. I fixed the issue and packaged it as a library: ELS-RD/transformer-deploy (https://github.com/ELS-RD/transformer-deploy), an efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀.
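
For readers landing on this thread: below is a minimal sketch of the pytorch_quantization PTQ calibration flow the title refers to (calibrate activation ranges, then enable int8 fake-quant). The model name and calibration texts are placeholders for illustration, not the thread's actual setup.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from pytorch_quantization import nn as quant_nn
from pytorch_quantization import quant_modules

# Monkey-patch torch.nn layers (Linear, etc.) with quantized equivalents.
# This must run BEFORE the model is instantiated.
quant_modules.initialize()

model_name = "bert-base-uncased"  # placeholder model, not the thread's setup
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name).cuda().eval()

# Step 1: put every TensorQuantizer into calibration mode
# (collect activation statistics, no fake-quant yet).
for module in model.modules():
    if isinstance(module, quant_nn.TensorQuantizer):
        if module._calibrator is not None:
            module.disable_quant()
            module.enable_calib()
        else:
            module.disable()

# Step 2: run representative data through the model to gather statistics.
calibration_texts = ["placeholder calibration sentence"] * 8  # use real data
with torch.no_grad():
    for text in calibration_texts:
        inputs = tokenizer(text, return_tensors="pt").to("cuda")
        model(**inputs)

# Step 3: compute scales (amax) from the statistics and enable int8 fake-quant.
for module in model.modules():
    if isinstance(module, quant_nn.TensorQuantizer):
        if module._calibrator is not None:
            module.load_calib_amax()
            module.enable_quant()
            module.disable_calib()
        else:
            module.enable()
```

The quantized model then still needs to be exported (e.g. to ONNX/TensorRT) before int8 actually pays off at inference time; the library linked above wraps that end-to-end flow.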