I would be interested in finding out how to reproduce the deep learning inference speeds for the Jetson Nano reported in the Nvidia blog (https://devblogs.nvidia.com/jetson-nano-ai-computing/), specifically the benchmark chart (https://devblogs.nvidia.com/wp-content/uploads/2019/03/imageLikeEmbed-1024x510.png).
For example: SSD MobileNet-V2 (300x300) reportedly runs at 39 fps on the Jetson Nano, which is faster than the roughly 20 fps I get with TensorRT on the Jetson TX2 I have access to (my TX2 result is consistent with the benchmarks listed in the Nvidia tf_trt_models repository: https://github.com/NVIDIA-AI-IOT/tf_trt_models#models-1).
The Phoronix review ranks Jetson Nano deep learning inference performance consistently below Jetson TX2 performance: https://www.phoronix.com/scan.php?page=article&item=nvidia-jetson-nano&num=3. That ranking seems plausible, since the Nano uses the older Maxwell GPU architecture and has half as many CUDA cores as the TX2 (128 vs. 256).
Are there any new TensorRT optimizations specific to the Nano? And how can the performance numbers from the Nvidia blog be reproduced?
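For clarity, the fps figures I quote are throughput measured with a simple warm-up-then-time loop along these lines. This is only a minimal sketch of the measurement method; `dummy_infer` is a hypothetical stand-in for the actual TensorRT/TF-TRT inference call:

```python
import time

def measure_fps(infer, num_warmup=10, num_iters=100):
    """Time an inference callable and return frames per second."""
    # Warm-up iterations exclude one-time engine/graph setup cost.
    for _ in range(num_warmup):
        infer()
    start = time.perf_counter()
    for _ in range(num_iters):
        infer()
    elapsed = time.perf_counter() - start
    return num_iters / elapsed

# Hypothetical stand-in: replace with the real TensorRT inference call.
def dummy_infer():
    time.sleep(0.001)  # pretend inference takes ~1 ms

fps = measure_fps(dummy_infer)
print("%.1f fps" % fps)
```

If the blog numbers were measured differently (e.g. including or excluding pre/post-processing, or at a different batch size), that alone could explain part of the gap.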