How to correctly implement quantization

Hi experts,
I want to implement a quantize function that performs the same quantization TensorRT uses.
My code is below:

__device__ __forceinline__ int8_t quantize(float val, float quantScale) {
    float s = val / quantScale;
    int32_t res;
    // cvt.rni.sat.s8.f32: round to nearest even, saturate to [-128, 127]
    asm volatile("cvt.rni.sat.s8.f32 %0, %1;" : "=r"(res) : "f"(s));
    return static_cast<int8_t>(res);
}
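
One way to check a device function like this is to compare its output against a host-side reference. The sketch below is only an illustration: quantize_ref is a hypothetical helper name, not a TensorRT API, and it assumes the host uses the default floating-point rounding mode (round to nearest, ties to even), which is what the .rni modifier does on the device. It also assumes the device division is IEEE-compliant; if the kernel is built with --use_fast_math, the division may be approximate and results can differ by one step near rounding boundaries.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical host-side reference for the device quantize() function.
// Assumes the default rounding mode (round to nearest, ties to even).
inline int8_t quantize_ref(float val, float quantScale) {
    float s = val / quantScale;
    if (std::isnan(s)) return 0;      // float-to-integer cvt maps NaN to 0
    float r = std::nearbyint(s);      // round to nearest, ties to even (.rni)
    if (r > 127.0f) r = 127.0f;       // saturate to the int8 range (.sat)
    if (r < -128.0f) r = -128.0f;
    return static_cast<int8_t>(r);
}
```

Running the kernel on a set of inputs, copying the results back, and comparing each element with quantize_ref(val, quantScale) should show whether a mismatch comes from this function or from somewhere else in the pipeline.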

However, after I call this function, the output has some accuracy errors.
I would like to know whether my implementation is correct.
Thank you.

Hi @JeremyYuan ,
Could you please share the script and model needed to reproduce the issue?

But how?

Could you upload them to a drive and share the link with us?

Sorry, several days ago I found that the bug was in other logic in my code. This code proved to be OK.
I should have updated this thread sooner.
Thanks again.