I have been playing around with the OpenCV dnn module on both the CPU and GPU of a Jetson Nano, measuring the execution time of super-resolution algorithms based on four different models: EDSR, ESPCN, FSRCNN, and LapSRN. I use the following code:
import cv2
from time import time
sr = cv2.dnn_superres.DnnSuperResImpl_create()
path = "EDSR_x2.pb"
# path = "ESPCN_x4.pb"
# path = "FSRCNN_x4.pb"
# path = "LapSRN_x4.pb"
sr.readModel(path)
# Set CUDA backend and target to enable GPU inference
sr.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
sr.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)
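# Note: if the OpenCV build lacks CUDA support, dnn logs a warning and falls back to the CPU backend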
sr.setModel("edsr", 2)
# sr.setModel("espcn", 4)
# sr.setModel("fsrcnn", 4)
# sr.setModel("lapsrn", 4)
img = cv2.imread('butterfly.png')
start = time()
result = sr.upsample(img)
print(time() - start)
cv2.imwrite('edsr_output.png', result)
# cv2.imwrite('espcn_output.png', result)
# cv2.imwrite('fsrcnn_output.png', result)
# cv2.imwrite('lapsrn_output.png', result)
As the input image I use “butterfly”, with a resolution of 232 × 155 px (from this link: https://miro.medium.com/max/464/1*A8yToxEh-f0_1Up8u51aHQ.png).
The measured execution times are as follows:
- EDSR x2: CPU: did not finish, GPU: did not finish
- ESPCN x4: CPU: 0.17469215393066406 s, GPU: 10.169917821884155 s
- FSRCNN x4: CPU: 0.12776947021484375 s, GPU: 5.2502007484436035 s
- LapSRN x4: CPU: 8.098081111907959 s, GPU: 6.410776138305664 s
For some reason I can't get the EDSR model to finish on either device: it takes too long and the program exits.
ESPCN and FSRCNN are MUCH faster on the CPU, and I don’t understand why.
Only LapSRN is faster on the GPU, and even there the improvement isn’t significant.
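One thing I'm not sure about: the first inference on the CUDA backend also pays one-time setup costs (CUDA context creation, weight upload, kernel selection), so timing a single upsample() call may not reflect steady-state speed. A minimal sketch of a timing variant with a warm-up call and averaging (the FSRCNN model and the run count of 10 are just my arbitrary choices):

import cv2
from time import time
sr = cv2.dnn_superres.DnnSuperResImpl_create()
sr.readModel("FSRCNN_x4.pb")
sr.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
sr.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)
sr.setModel("fsrcnn", 4)
img = cv2.imread('butterfly.png')
# Warm-up call: absorbs the one-time CUDA initialization so it doesn't skew the measurement
sr.upsample(img)
n_runs = 10  # arbitrary number of timed iterations
start = time()
for _ in range(n_runs):
    sr.upsample(img)
print((time() - start) / n_runs, "s per upsample (steady state)")

Averaging over several runs should also smooth out scheduler jitter on the Nano, but even if the steady-state numbers look different, the question below still stands.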
Are these results normal?
Why is the GPU performance so poor compared to the CPU, which on the Nvidia Jetson Nano isn’t exactly powerful?
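In case it's relevant, this is how I would check whether my OpenCV build actually has CUDA support at all (a minimal sketch; as far as I know, a dnn module built without CUDA just falls back to the CPU backend with a warning):

import cv2
# 0 means OpenCV either sees no GPU or was built without CUDA support
print(cv2.cuda.getCudaEnabledDeviceCount())
# The build summary shows whether CUDA/cuDNN were compiled in for the dnn module
print(cv2.getBuildInformation())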