As mentioned in the link above, I found that running yolov5s in half precision (FP16) is slower than in full precision (FP32) on the Jetson AGX Orin. The same code behaves as expected on the Jetson Xavier NX, where half precision is faster than full precision. What could be the reason?
The code runs under the PyTorch framework. I also just tested TensorRT in FP32 and FP16, and the results there are as expected: FP16 is faster than FP32. What might be causing the problem in PyTorch?
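For reference, here is a minimal sketch of how an FP32 vs FP16 comparison like this could be timed in PyTorch. It is not the exact code from the link; the helper names `benchmark` and `compare_precisions` are hypothetical, and the warm-up and iteration counts are arbitrary assumptions. CUDA kernels launch asynchronously, so the sketch synchronizes before reading the clock and discards a few warm-up runs.

```python
import time


def benchmark(fn, warmup=10, iters=50, sync=None):
    """Return the mean wall-clock time of fn() in milliseconds.

    sync is an optional callable (e.g. torch.cuda.synchronize) that is run
    before the timer starts and before it stops, so queued GPU work is counted.
    """
    sync = sync or (lambda: None)
    for _ in range(warmup):  # warm-up runs are not timed
        fn()
    sync()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    sync()
    return (time.perf_counter() - start) * 1000.0 / iters


def compare_precisions(model, example, device="cuda"):
    """Hypothetical helper: time model(example) in FP32, then in FP16.

    Assumes PyTorch with CUDA is installed; model is a torch.nn.Module and
    example is an input tensor. The model is converted in place.
    """
    import torch

    results = {}
    for name, dtype in (("fp32", torch.float32), ("fp16", torch.float16)):
        m = model.to(device=device, dtype=dtype).eval()
        x = example.to(device=device, dtype=dtype)
        with torch.no_grad():
            results[name] = benchmark(lambda: m(x), sync=torch.cuda.synchronize)
    return results
```

Calling `compare_precisions(model, img)` with the loaded yolov5s model and a preprocessed image tensor would return a dict with the mean milliseconds per forward pass for each precision, which makes the numbers from the two boards directly comparable.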