We use the open-source AprilTags C++ library for barcode scanning. We find that detection on the initial frame takes around 110 ms on TX1, while an Intel i7 needs only around 40 ms.
If the AGV moves at 1.5 m/s, 110 ms means it travels 16.5 cm while the barcode is being detected on TX1, versus only 6 cm on the i7.
So with TX1 the AGV is more likely to miss the barcode target.
Please help check whether AprilTags detection on TX1 can be optimized to under 50 ms for the initial frame.
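For reference, the travel-distance figures above can be reproduced with a few lines of C++. This is only a sketch of the arithmetic; the function names travel_cm and latency_budget_ms are made up for illustration and are not part of AprilTags.

```cpp
#include <cassert>

// Distance (in cm) the vehicle covers while one detection is still running.
//   speed_m_per_s: vehicle speed in metres per second
//   latency_ms:    detection time in milliseconds
double travel_cm(double speed_m_per_s, double latency_ms) {
    assert(speed_m_per_s >= 0.0 && latency_ms >= 0.0);
    const double latency_s = latency_ms / 1000.0;  // ms -> s
    return speed_m_per_s * latency_s * 100.0;      // m -> cm
}

// Longest detection time (in ms) that keeps the travel within max_travel_cm.
double latency_budget_ms(double speed_m_per_s, double max_travel_cm) {
    assert(speed_m_per_s > 0.0 && max_travel_cm >= 0.0);
    const double max_travel_m = max_travel_cm / 100.0;  // cm -> m
    return max_travel_m / speed_m_per_s * 1000.0;       // s -> ms
}
```

With a speed of 1.5 m/s, travel_cm(1.5, 110) gives about 16.5 cm and travel_cm(1.5, 40) gives about 6 cm, matching the numbers in the post. The 50 ms target corresponds to roughly 7.5 cm of travel.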
How did you check the initial frame detection time? Did you just run ./build/bin/apriltags_demo?
Could you try the latest BSP (R24.2.1)? You can maximize CPU/GPU/EMC performance by running “$ sudo ./jetson_clocks.sh”, then check the timing again.
Dear Vickyy,
I tried your suggestion, but the performance did not improve.
Do you have any other ideas for improving it?
I also found that our application doesn’t use CUDA. Do you think that affects the performance in our case?
Can you share a detailed performance comparison between the i7 and TX1?
I would like to know how you profile these tasks. Can you tell me which parts you added logging to?
It would be better if you could share the source code. Thanks.
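To make the numbers comparable between the two machines, one option is to time the detection call itself with a monotonic clock and report the first frame separately from the later ones. The sketch below is only an assumption about how that could be done: time_call_ms, profile_frames and FrameTimings are hypothetical names, and detect_once stands for whatever callable wraps the detection step in your code.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <vector>

// Wall-clock time of one call to fn(), in milliseconds.
// steady_clock is monotonic, so system clock adjustments do not affect it.
template <typename Fn>
double time_call_ms(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

struct FrameTimings {
    double first_ms;      // first call, which includes any one-time setup cost
    double mean_rest_ms;  // mean of the remaining calls (0 if there are none)
};

// Calls detect_once() `frames` times and separates the first call
// from the average of the following calls.
template <typename Fn>
FrameTimings profile_frames(Fn&& detect_once, std::size_t frames) {
    assert(frames >= 1);
    std::vector<double> ms;
    ms.reserve(frames);
    for (std::size_t i = 0; i < frames; ++i) {
        ms.push_back(time_call_ms(detect_once));
    }
    FrameTimings t{ms.front(), 0.0};
    if (ms.size() > 1) {
        const double rest = std::accumulate(ms.begin() + 1, ms.end(), 0.0);
        t.mean_rest_ms = rest / static_cast<double>(ms.size() - 1);
    }
    return t;
}
```

Passing a lambda that runs the detector on one grayscale frame, with the same image and the same frame count on both the i7 and TX1, gives a first-frame figure and a steady-state figure for each machine. Wrapping individual stages of the detector with time_call_ms in the same way would show which stage accounts for the difference.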
I used Tegra_System_Profiler from JetPack to profile your sample code and found that it is a purely CPU application, so raising the GPU clock cannot help in your case.
If you can port those functions to CUDA, the performance may improve.
We provide some CUDA samples in JetPack. Please use them as a reference if needed.