Haar cascade with cuda xml classifier doesn't work

oleksanr.v · June 4, 2021, 10:44am

I have installed OpenCV 4.5.1 and TensorFlow 2.4 in a Docker container. The JetPack version is 4.5.1. I am trying to use the OpenCV CUDA classifiers from the official repo [https://github.com/opencv/opencv/tree/master/data/haarcascades_cuda].
In my application I use Python to load the XML file:

self.open_eyes_detector = cv2.cuda_CascadeClassifier.create(BASE_DIR + '/models/cuda/haarcascade_eye_tree_eyeglasses.xml')

And then I run the multi-scale detector:

gpu_gray_face = cv2.cuda_GpuMat(gray_face)
open_eyes_glasses_result = self.open_eyes_detector.detectMultiScale(gpu_gray_face).download()

But only the smile detector works. Every other Haar classifier gives this error:

cv2.error: OpenCV(4.5.1) /tmp/build_opencv/opencv_contrib/modules/cudaobjdetect/src/cascadeclassifier.cpp:155: error: (-217:Gpu API call) NCV Assertion Failed: cudaError_t=702, file=/tmp/build_opencv/opencv_contrib/modules/cudalegacy/src/cuda/NCVHaarObjectDetection.cu, line=1157 in function 'NCVDebugOutputHandler'

The ./deviceQuery test passes, so I conclude that the CUDA drivers are working. What could be the reason for this error?
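
For reference, the `cudaError_t=702` in the message above is a CUDA runtime error code. In the CUDA 10.2 toolkit that ships with JetPack 4.5.1, 702 is `cudaErrorLaunchTimeout`. A minimal sketch for pulling the code out of such a message follows; the helper name `decode_cuda_error` and the small lookup table are illustrative assumptions, not part of OpenCV or CUDA:

```python
import re

# Hand-picked subset of CUDA runtime error codes (numbering used by CUDA 10.2).
CUDA_ERROR_NAMES = {
    2: "cudaErrorMemoryAllocation",
    700: "cudaErrorIllegalAddress",
    701: "cudaErrorLaunchOutOfResources",
    702: "cudaErrorLaunchTimeout",
}


def decode_cuda_error(message):
    """Return (code, name) for the first 'cudaError_t=<n>' in a message, or None."""
    match = re.search(r"cudaError_t=(\d+)", message)
    if match is None:
        return None
    code = int(match.group(1))
    # Codes missing from the table above are reported as "unknown".
    return code, CUDA_ERROR_NAMES.get(code, "unknown")


msg = "error: (-217:Gpu API call) NCV Assertion Failed: cudaError_t=702"
print(decode_cuda_error(msg))  # (702, 'cudaErrorLaunchTimeout')
```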

DaneLLL · June 7, 2021, 2:35am

Hi,
Are you able to try other CUDA filters, such as

filter = cv::cuda::createSobelFilter(CV_8UC4, CV_8UC4, 1, 0, 3, 1, cv::BORDER_DEFAULT);

or

filter = cv::cuda::createGaussianFilter(CV_8UC4, CV_8UC4, cv::Size(31,31), 0, 0, cv::BORDER_DEFAULT);

I would like to know whether the failure is specific to calling the cascade classifier.

oleksanr.v · June 7, 2021, 7:06am

Those filters were created successfully:

cv2.cuda.createGaussianFilter(cv2.CV_8UC4, cv2.CV_8UC4, (31, 31), 0, 0, cv2.BORDER_DEFAULT)
<cuda_Filter 0x7f962cbe90>
cv2.cuda.createSobelFilter(cv2.CV_8UC4, cv2.CV_8UC4, 1, 0, 3, 1, cv2.BORDER_DEFAULT)
<cuda_Filter 0x7f9cc06330>
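
Creating a filter object may not exercise much GPU code; applying it to an image is what runs a kernel. A minimal sketch of such a check is below, assuming an OpenCV build with the CUDA modules and NumPy available. The function name `cuda_filter_smoke_test` and the 64x64 dummy image are illustrative choices:

```python
def cuda_filter_smoke_test():
    """Apply a Gaussian filter on the GPU.

    Returns True on success, or None when OpenCV/NumPy or a CUDA device
    is not available in the current environment.
    """
    try:
        import cv2
        import numpy as np
    except ImportError:
        return None

    cuda = getattr(cv2, "cuda", None)
    if cuda is None or cuda.getCudaEnabledDeviceCount() <= 0:
        return None

    host = np.zeros((64, 64, 4), dtype=np.uint8)  # dummy 4-channel image
    gpu_src = cv2.cuda_GpuMat()
    gpu_src.upload(host)

    gauss = cv2.cuda.createGaussianFilter(cv2.CV_8UC4, cv2.CV_8UC4, (31, 31), 0)
    gpu_dst = gauss.apply(gpu_src)  # the filter kernel is launched here
    return gpu_dst.download().shape == host.shape


print(cuda_filter_smoke_test())
```

If this call succeeds while `detectMultiScale` still fails, that points at the cascade classifier path rather than at CUDA in general.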

DaneLLL · June 8, 2021, 5:39am

Hi,
Does the cascade classifier work outside Docker? I would like to know if it works when you call the functions on the Jetson Nano directly, without Docker.

oleksanr.v · June 8, 2021, 6:36pm

The same error occurs on the Jetson Nano without Docker. And after the error, other calls to cv2 don't work either:

cv2.error: OpenCV(4.4.0) /tmp/build_opencv/opencv/modules/core/src/cuda/gpu_mat.cu:121: error: (-217:Gpu API call) the launch timed out and was terminated in function 'allocate'

DaneLLL · June 9, 2021, 2:27am

Hi,
Looks like it fails in cudaMalloc():
https://github.com/opencv/opencv/blob/master/modules/core/src/cuda/gpu_mat.cu#L121

The cascade classifier downscales the image multiple times, so memory is probably insufficient.

Not sure if it helps, but please try the lightweight LXDE desktop:
Save 1GB of Memory! Use LXDE on your Jetson - JetsonHacks
This should free up more memory, which may be enough to run the cascade classifier.
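
As a rough way to reason about the repeated downscaling, the sketch below counts the levels and total pixels of a generic image pyramid. It is only an estimate under stated assumptions: the 20x20 window, the 1.2 scale factor, and the 640x480 input are illustrative values, and the loop is not a model of what the NCV Haar implementation allocates internally.

```python
def pyramid_stats(width, height, window=(20, 20), scale_factor=1.2):
    """Count levels and total pixels of a generic downscaling pyramid.

    Each level shrinks the image by `scale_factor` until it is smaller
    than the detection window. Illustrative only.
    """
    levels, total_pixels = 0, 0
    scale = 1.0
    while True:
        w, h = int(width / scale), int(height / scale)
        if w < window[0] or h < window[1]:
            break
        levels += 1
        total_pixels += w * h
        scale *= scale_factor
    return levels, total_pixels


levels, pixels = pyramid_stats(640, 480)
# Ratio of all pyramid pixels to the pixels of the original image.
print(levels, pixels, pixels / (640 * 480))
```

For a scale factor s, the sum over all levels is bounded by the geometric series 1 / (1 - 1/s^2), which is about 3.3 for s = 1.2, so the scaled copies alone stay within a small multiple of the input image size.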

oleksanr.v · June 9, 2021, 8:11am

Good suggestion, but I have disabled the GUI entirely, and there are 2.5 GB of free RAM and 4 GB of free swap. When I run classification, neither of them starts to decrease.
It looks like this error is related to CU_DEVICE_ATTRIBUTE_KERNEL_EXEC_TIMEOUT.
This page says that after such a timeout all existing device memory allocations from this context are invalid:
https://simoncblyth.bitbucket.io/env/notes/cuda/cuda_timeouts/
But how can I figure out why it takes so long to process a GPU frame? Are more detailed logs available?
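
Whether the run-time limit applies to a device can be read from the "Run time limit on kernels" line of the deviceQuery output, or queried programmatically. The sketch below calls `cudaDeviceGetAttribute` from the CUDA runtime through `ctypes`. The library name `libcudart.so` is an assumption (on some systems a full path or versioned file name is needed), and the helper name `kernel_exec_timeout_enabled` is illustrative:

```python
import ctypes
import ctypes.util

# Value of cudaDevAttrKernelExecTimeout in the CUDA runtime's cudaDeviceAttr enum.
CUDA_DEV_ATTR_KERNEL_EXEC_TIMEOUT = 17


def kernel_exec_timeout_enabled(device=0):
    """Return True/False for the device's kernel run-time limit, or None if unknown."""
    found = ctypes.util.find_library("cudart")
    candidates = [found] if found else []
    candidates.append("libcudart.so")  # assumed name; adjust for your install
    for name in candidates:
        try:
            cudart = ctypes.CDLL(name)
            value = ctypes.c_int(0)
            status = cudart.cudaDeviceGetAttribute(
                ctypes.byref(value), CUDA_DEV_ATTR_KERNEL_EXEC_TIMEOUT, device
            )
        except (OSError, AttributeError):
            continue  # library not loadable or symbol missing; try the next one
        # A non-zero status means the query itself failed (for example, no GPU).
        return bool(value.value) if status == 0 else None
    return None


print(kernel_exec_timeout_enabled())
```

A result of `True` only confirms that a watchdog limit is active on that device; it does not explain why the Haar kernel runs long enough to hit it.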