Program running time is too long (TensorFlow and YOLOv4)

My ENV:

Jetson Nano 4GB.
R32 (release), REVISION: 4.4, GCID: 23942405, BOARD: t210ref, EABI: aarch64
TensorFlow 1.15.4
Keras 2.3.1

I use YOLOv4 based on TensorFlow for prediction.

When the program runs on the CPU, the prediction time for each image is about 15 seconds.

When the program runs on the GPU, the prediction time increases to between 50 and 500 seconds per image.

How can I solve this problem?
Thanks a lot

Hi,

Please first check whether you have maximized the device performance:

$ sudo nvpmodel -m 0
$ sudo jetson_clocks

Then please monitor the application's memory usage with tegrastats.
If the app falls back on swap memory while running, performance will suffer because swap I/O is much slower than RAM.
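
If it helps, the swap check can be scripted. Below is a minimal Python sketch that pulls the RAM and SWAP figures out of a tegrastats line. The sample line is made up for illustration and is not from your device, and the regular expression assumes the usual "RAM used/totalMB ... SWAP used/totalMB" layout, which may differ between L4T releases. The function names are hypothetical.

```python
import re

# Illustrative tegrastats line (an assumption about the format, not real output
# from the device in this thread).
SAMPLE_LINE = (
    "RAM 3712/3956MB (lfb 2x1MB) SWAP 1024/1978MB (cached 12MB) "
    "CPU [45%@1479,30%@1479,28%@1479,33%@1479]"
)

# Matches "RAM <used>/<total>MB" and "SWAP <used>/<total>MB".
# The \b keeps "IRAM" (reported on some boards) from matching as "RAM".
_USAGE = re.compile(r"\b(RAM|SWAP) (\d+)/(\d+)MB")


def parse_memory(line):
    """Return {'RAM': (used, total), 'SWAP': (used, total)} in MB for one line."""
    return {
        name: (int(used), int(total))
        for name, used, total in _USAGE.findall(line)
    }


def swap_in_use(line):
    """True if the line reports a non-zero amount of swap in use."""
    usage = parse_memory(line)
    return usage.get("SWAP", (0, 0))[0] > 0


if __name__ == "__main__":
    print(parse_memory(SAMPLE_LINE))
    print("swap in use:", swap_in_use(SAMPLE_LINE))
```

One way to use it is to save tegrastats output to a file while the prediction runs and feed each line to swap_in_use(). If SWAP usage climbs during inference, the model probably does not fit in the Nano's 4GB of shared memory.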

Thanks.