How to correctly implement quantization

Hi experts,
I want to implement a quantize function that performs the same quantization TensorRT uses.
My code is below:

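__device__ __forceinline__ int8_t quantize(float val, float quantScale) {
    // Scale the input into the int8 range.
    float s = val / quantScale;
    int32_t res;
    // Round to nearest integer (ties to even) and saturate to [-128, 127].
    asm volatile("cvt.rni.sat.s8.f32 %0, %1;" : "=r"(res) : "f"(s));
    return static_cast<int8_t>(res);
}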

However, after I call this function, the output shows some accuracy errors.
I would like to know whether my implementation is correct.
Thank you.
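
For comparison, a host-side reference along the following lines could be used to check the kernel output element by element. This is only a sketch: quantize_ref is a hypothetical name that is not part of TensorRT, and it assumes the intended behaviour is round-to-nearest-even with saturation to [-128, 127] applied to val / quantScale, which is what the PTX instruction in the device function does. Whether TensorRT follows exactly this scale convention (dividing by the scale rather than multiplying by a reciprocal) is an assumption here.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical host-side reference (not from TensorRT): mirrors
// cvt.rni.sat.s8.f32 applied to val / quantScale.
inline int8_t quantize_ref(float val, float quantScale) {
    float s = val / quantScale;
    // PTX float-to-integer conversion maps NaN to 0.
    if (std::isnan(s)) return 0;
    // Rounds ties to even, assuming the default FE_TONEAREST rounding mode.
    float r = std::nearbyint(s);
    // Saturate to the int8 range.
    if (r > 127.0f) r = 127.0f;
    if (r < -128.0f) r = -128.0f;
    return static_cast<int8_t>(r);
}
```

Running the same inputs through the device function and through quantize_ref, then printing the first mismatching element, should show whether the differences come from the rounding step or from somewhere else in the pipeline.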