I’m trying to see how much faster the TX2 can classify images with TensorFlow than a Raspberry Pi 3. My TensorFlow model was developed with transfer learning from the InceptionV3 CNN. My RPi takes about 20 seconds to classify an image, and to my surprise the TX2 also takes about 20 seconds. Here is the Python script I’m using to classify the image:
Here’s what the output looks like when I run that script:
Am I doing something wrong here? I was expecting the TX2 to drastically outperform the RPi.
Please remember to maximize CPU/GPU frequency to have better performance.
To accelerate TensorFlow, you can run inference on a TF model with our fast TensorRT engine.
More information about TensorRT can be found here:
Please remember to export a UFF model on an x86-based Linux machine first, and then run the UFF model with TensorRT on the TX2.
The output from my TensorFlow script indicates that it’s running on the GPU (see gist). I doubt the default GPU frequency is the limiting factor here. NVIDIA’s TensorRT image classification examples run screaming fast (like, 20 image classifications per second, fast). I think there’s something wrong with how I installed TensorFlow if it can only classify 1 image every 20 seconds.
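One thing that may be worth ruling out before blaming the install: a single end-to-end run of a script also pays for one-time costs (loading the graph, creating the session, GPU initialization on the first call), so it can look much slower than steady-state inference. A minimal sketch for separating the two is below. It assumes Python 3, and `time_calls` is a hypothetical helper name, not something from the original script; the callable you pass in would wrap your own `sess.run(...)` call.

```python
import time


def time_calls(fn, runs=5):
    """Call fn() `runs` times and return (first_call_s, mean_of_later_calls_s).

    The first call is reported separately because it usually includes
    one-time setup work; the mean of the remaining calls approximates
    steady-state latency. Returns None for the mean if runs == 1.
    """
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    first = durations[0]
    later = durations[1:]
    steady = sum(later) / len(later) if later else None
    return first, steady


# Hypothetical usage with an already-created session (names are assumptions):
#   first, steady = time_calls(lambda: sess.run(output_tensor, feed_dict=feed))
#   print("first call: %.2fs, later calls: %.2fs" % (first, steady))
```

If the later calls are fast and only the first one is slow, the 20 seconds is mostly startup cost rather than per-image classification time.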
Please check whether TensorFlow is using swap memory.
What do you mean by checking whether TensorFlow uses swap memory?
What’s the effect of swap memory?
Please check it via tegrastats:
How about your result?
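For anyone following the swap question: swap lives on storage, which is far slower than RAM, so if the model's working set spills into swap, inference can slow down dramatically. As a cross-check that does not depend on tegrastats, here is a minimal sketch that reads the standard Linux `/proc/meminfo` fields `SwapTotal` and `SwapFree`. The function names are hypothetical, and the number is system-wide, not specific to the TensorFlow process.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: integer value}.

    Most fields are reported in kB; the unit suffix is ignored here.
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value"
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def swap_used_kb(text):
    """Return system-wide swap in use (kB): SwapTotal minus SwapFree."""
    info = parse_meminfo(text)
    return info.get("SwapTotal", 0) - info.get("SwapFree", 0)


# Usage on the board, e.g. before and during a classification run:
#   with open("/proc/meminfo") as f:
#       print("swap in use (kB):", swap_used_kb(f.read()))
```

If the value climbs while the script runs, memory pressure is a likely suspect. A per-process figure is available as `VmSwap` in `/proc/<pid>/status`.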
I met the same issue.
Here are two suggestions for accelerating inference:
- Maximize the CPU/GPU clock
- Run inference on your model with TensorRT
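For the clock suggestion, a rough sketch of how it could be scripted from Python is below. It assumes a TX2 where `nvpmodel -m 0` selects the maximum-performance mode and the clocks script sits at `~/jetson_clocks.sh`; that location differs between JetPack releases, so it is a parameter here. The function names are hypothetical, and `dry_run` defaults to `True` so nothing is executed unless you ask for it.

```python
import os
import subprocess


def max_clock_commands(clocks_script="~/jetson_clocks.sh"):
    """Build the commands to run; the script path is an assumption."""
    return [
        ["sudo", "nvpmodel", "-m", "0"],              # max-performance power mode
        ["sudo", os.path.expanduser(clocks_script)],  # pin clocks to their maximum
    ]


def maximize_clocks(dry_run=True, clocks_script="~/jetson_clocks.sh"):
    """Return the commands, and execute them only when dry_run is False."""
    commands = max_clock_commands(clocks_script)
    if not dry_run:
        for command in commands:
            subprocess.check_call(command)  # raises if a command fails
    return commands


# Usage: print the commands first, then rerun with dry_run=False on the board.
#   for command in maximize_clocks():
#       print(" ".join(command))
```

The settings do not persist across reboots, so they would need to be reapplied before each benchmarking session.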