Is slow facial comparison normal?

I have the 2GB version of the Nano and was running a few tests of facial comparison using the face_recognition module. I've recompiled dlib with CUDA support. My code is below. What's weird is that running it under time python3 reports a real time of over 15 seconds to load the images, get the face encodings, and compare 2 faces. While it runs, jtop shows 1 CPU core at 100%, and the GPU briefly jumps to 99% before returning to 0%.

Anyone know why this is happening?

My code:

import face_recognition

# Load both images from disk as numpy arrays
known_image = face_recognition.load_image_file("katie-1.jpg")
unknown_image = face_recognition.load_image_file("katie-2.jpg")

# Take the encoding of the first face found in each image
known_encoding = face_recognition.face_encodings(known_image)[0]
unknown_encoding = face_recognition.face_encodings(unknown_image)[0]

# One boolean per known encoding: True if the faces match
results = face_recognition.compare_faces([known_encoding], unknown_encoding)
print(results)



There is some pre-processing in dlib.

This step usually runs on the CPU and is the bottleneck of the whole pipeline.
GPU inference has to wait for that data, so the GPU won't be fully utilized.
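As a rough check of where the 15 seconds go, each stage can also be timed separately from Python. This is only a sketch: time_stage and profile_face_compare are hypothetical helper names, not part of face_recognition or dlib. It assumes face detection is split out with face_locations (which uses the HOG detector by default and, as far as I know, runs on the CPU) and that the resulting boxes are passed to face_encodings through known_face_locations.

```python
import time


def time_stage(label, fn, *args, **kwargs):
    """Run fn(*args, **kwargs), print the wall time, return (result, seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"{label:<10s} {elapsed:8.3f} s")
    return result, elapsed


def profile_face_compare(known_path, unknown_path):
    """Hypothetical helper: time import, load, detect, encode and compare."""
    # Import inside the function so the import cost is measured as well.
    start = time.perf_counter()
    import dlib
    import face_recognition as fr
    print(f"{'import':<10s} {time.perf_counter() - start:8.3f} s")

    # Reports whether this dlib build was compiled with CUDA support.
    print("dlib built with CUDA:", dlib.DLIB_USE_CUDA)

    encodings = []
    for path in (known_path, unknown_path):
        image, _ = time_stage("load", fr.load_image_file, path)
        # Detection step (default model is HOG).
        boxes, _ = time_stage("detect", fr.face_locations, image)
        # Encoding step, reusing the boxes so detection is not repeated.
        found, _ = time_stage(
            "encode", fr.face_encodings, image, known_face_locations=boxes
        )
        if not found:
            raise ValueError(f"no face found in {path}")
        encodings.append(found[0])

    result, _ = time_stage("compare", fr.compare_faces, [encodings[0]], encodings[1])
    return result


# Example call with the images from the question:
# profile_face_compare("katie-1.jpg", "katie-2.jpg")
```

If most of the time shows up under import or under the first encode only, the cost is probably one-time startup rather than the comparison itself, but that is something the numbers would need to confirm.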

You can find the bottleneck task with our profiler first: