We recently migrated a detection algorithm, developed with the TensorRT C++ API, to a Jetson Nano platform.
With the same configuration and the same library versions, we observe different output predictions: the predicted bounding boxes are shifted towards the bottom right, and there are errors in the predicted scores. This happens in both FP32 and FP16 modes.
Is there any known precision loss associated with one or more layers when running on the Nano?
We have tested the same algorithm on many other desktop platforms and never had any issues.
We use CUDA 10.2 and TensorRT 188.8.131.52.
Thanks in advance.