Is OpenCV really using the GPU for detection?

Hi there.
A few moments ago I finally found my solution. Now my script is using the CUDA cores!
Before the modification I was averaging 1.25 fps with a 416x416 input blob, and now it is running at 6.5 fps at the same input size. That is a big improvement considering it is an edge device.
To actually use the CUDA cores, you need to add the following two lines after
net = cv2.dnn.readNetFromDarknet(cfgPath,weightsPath)

net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)
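
For anyone finding this later, here is a minimal sketch of how those calls fit into a full detection pass; the file names, the 1/255 scale factor, and the FP16 remark are my own placeholders/assumptions, not taken from my actual script:

import cv2

cfgPath = "yolov4.cfg"            # placeholder config path
weightsPath = "yolov4.weights"    # placeholder weights path

net = cv2.dnn.readNetFromDarknet(cfgPath, weightsPath)
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)  # DNN_TARGET_CUDA_FP16 also exists, for GPUs with fast FP16

frame = cv2.imread("frame.jpg")   # placeholder input frame
blob = cv2.dnn.blobFromImage(frame, 1/255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
outputs = net.forward(net.getUnconnectedOutLayersNames())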

I did not know that these two lines were needed for the net to properly use the GPU.
This is the page that pointed me in the right direction: How to use OpenCV DNN Module with NVIDIA GPUs on Linux

And these are my current GPU graphs in the jtop app:

And the CPU is no longer doing all the math like before:

Even so, many thanks to @AastaLLL for all the replies and for taking the time to help solve my issue.

Something I do not understand is why the learnopencv blog uses the argparse method in their Python script to activate CUDA. Can someone explain that?
These are fragments of the script:

import argparse
import cv2

parser = argparse.ArgumentParser(description="Run keypoint detection")
parser.add_argument("--device", default="cpu", help="Device to inference on")
args = parser.parse_args()

net = cv2.dnn.readNetFromCaffe(protoFile, weightsFile)

if args.device == "cpu":
    net.setPreferableBackend(cv2.dnn.DNN_TARGET_CPU)
    print("Using CPU device")
elif args.device == "gpu":
    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
    net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)
    print("Using GPU device")
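
My guess is that the argparse part just selects the branch at run time, so the same script can be launched on CPU or GPU from the command line, something like this (keypoint_detection.py is just a placeholder name for the blog's script):

python keypoint_detection.py --device gpu    # takes the CUDA backend/target branch
python keypoint_detection.py --device cpu    # default, stays on the CPU

But I would still like to understand why the blog prefers that over simply hard-coding the two CUDA lines.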

Thanks