I’m testing EasyOCR (via NVIDIA-AI-IOT/scene-text-recognition) on a TX2 with JetPack 4.5.0. It works, but it takes roughly 10 seconds per image at 376x672 and 30 seconds per image at 720p.
I was not able to get torch2trt working, so this is with a torch-only inference engine. I’ve verified via tegrastats that the GPU is active during processing. Almost all of the time is spent in the detection phase, not the recognition phase.
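For reference, this is roughly how I attributed the time to detection vs. recognition. EasyOCR's `Reader` exposes `detect()` and `recognize()` separately, so the two phases can be timed independently; the timing helper below is runnable as-is, while the EasyOCR usage in the comments is a sketch (the `reader`/`img` names are placeholders, and the exact `recognize()` arguments may differ between EasyOCR versions):

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label, results):
    """Record the wall-clock duration of the enclosed block into results[label]."""
    start = time.perf_counter()
    yield
    results[label] = time.perf_counter() - start

# Usage sketch (requires easyocr and a CUDA-enabled torch build):
#
# import easyocr
# reader = easyocr.Reader(['en'], gpu=True)
# results = {}
# with timed('detect', results):
#     horizontal_list, free_list = reader.detect(img)
# with timed('recognize', results):
#     text = reader.recognize(img, horizontal_list[0], free_list[0])
# print(results)  # on my TX2, 'detect' dominates
```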
Some things I’ve tried that have had no real impact:
- jetson_clocks and nvpmodel -m 0
- building the torch wheel from source and installing it instead of the pre-built wheel the project points to
- using cudnn_benchmark=True
The same build process on a Xavier NX running JetPack 4.6 produced a 100x improvement in framerate, so either the hardware difference or the JetPack version difference seems the most likely culprit. Any idea what could be going on?