I followed AastaLLL’s instructions in the ‘Caffe failed with py-faster-rcnn demo.py on TX1’ post, and was able to build py-faster-rcnn and run its demo script on the Jetson TX1.
However, compared to a GeForce GPU card, the inference performance on the JTX1 is lackluster. More specifically, the py-faster-rcnn demo takes roughly 1.8s to process each image on the JTX1, while the same demo takes only ~0.09s on my x64 PC with a GTX-1080 graphics card, i.e. a roughly 20x gap. I have tried forcing the JTX1 to always run at maximum clock speeds with the ~/jetson_clocks.sh script, but that doesn’t help much. For other DNN/CNN tasks I typically see less than a 10x performance difference between the JTX1 and the GTX-1080 PC, but in this py-faster-rcnn case the JTX1 falls far behind.
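For reference, this is roughly how I am measuring the per-image time, a minimal sketch assuming the stock py-faster-rcnn layout where `im_detect` comes from `lib/fast_rcnn/test.py`; the prototxt/caffemodel paths and the demo image name are placeholders, not my exact setup:

```python
import time
import cv2
import caffe
from fast_rcnn.test import im_detect  # from py-faster-rcnn's lib/

# Placeholder paths: substitute whatever test prototxt/caffemodel
# tools/demo.py is loading (e.g. the VGG16 faster_rcnn model).
caffe.set_mode_gpu()
caffe.set_device(0)
net = caffe.Net('faster_rcnn_test.pt',
                'VGG16_faster_rcnn_final.caffemodel',
                caffe.TEST)

im = cv2.imread('004545.jpg')  # one of the bundled demo images

# Warm-up pass so CUDA initialization isn't counted in the timing.
im_detect(net, im)

start = time.time()
scores, boxes = im_detect(net, im)
print('Detection took {:.3f}s'.format(time.time() - start))
```

With this kind of measurement the warm-up pass matters: the very first forward pass on the JTX1 is much slower than steady state, and the 1.8s figure above is from the steady-state passes.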
Are there any suggestions for improving py-faster-rcnn inference performance on the JTX1? Thanks.
Screenshot of JTX1 case:
Screenshot of GTX-1080 case: