issues with dlib library

kp688 · April 6, 2019, 6:22am

I am trying to run a Python script on a Jetson Nano board that performs face detection and embedding calculation using the dlib library. When I print the embeddings, the 128 values are sometimes very large numbers and sometimes NaN. The same script displays correct values on all other devices, such as a TX2 or an i386/amd64 Linux machine.

I have tried installing dlib both ways: with “pip install dlib” and by building from source. In both cases I get the same result.

Could anyone please suggest how this problem can be resolved?

Thanks in advance

AastaLLL · April 8, 2019, 1:20am

Hi,

To help us investigate, would you mind sharing your script with us?
Thanks.

kp688 · April 8, 2019, 6:29am

Please find the code below

import os

import cv2
import dlib
import face_recognition

# Path where the test images are stored
imagefolderpath = "Images/"

# Model for face detection
face_detector = dlib.get_frontal_face_detector()

for filename in os.listdir(imagefolderpath):
    image = cv2.imread(os.path.join(imagefolderpath, filename), 1)
    if image is None:  # skip files OpenCV cannot read
        continue
    # OpenCV loads BGR; dlib and face_recognition expect RGB
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

    # Reset per image so boxes from earlier images are not reused
    face_locations = []
    for face in face_detector(rgb, 0):
        # face_recognition expects (top, right, bottom, left)
        face_locations.append((face.top(), face.right(), face.bottom(), face.left()))

    face_encodings = face_recognition.face_encodings(
        rgb, known_face_locations=face_locations, num_jitters=1)
    print(face_encodings)

    for (top, right, bottom, left) in face_locations:
        cv2.rectangle(image, (left, top), (right, bottom), (0, 0, 255), 2)
    cv2.imshow('Image', image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

AastaLLL · April 9, 2019, 7:08am

Hi,

We are installing the dlib library, which will take some time.
In the meantime, could you share the dlib and OpenCV versions on the Nano and TX2 with us?

Thanks.

Safa_V · April 10, 2019, 4:44pm

I have exactly the same issue. I have been testing C++ code and get NaN or very large numbers as the descriptor output.

dlib 19.17

abdu307 · April 11, 2019, 7:41am

I have the same issue. My code runs on TX1, TX2, and Xavier without problems, but it produces the same error on the Nano.
I tried both dlib 19.16 and 19.17.
The “face_recognition.batch_face_locations” function actually outputs correct face locations using “CNN”,
but “face_recognition.face_encodings” is the problem: its output is very large numbers or NaNs.

kp688 · April 11, 2019, 7:51am

Hi,

We tested the script with OpenCV 3 and dlib 19.17.

As abdu307 said, we also see the issue only when calculating embeddings. Face detection and identifying the face locations work fine.

AastaLLL · April 12, 2019, 8:24am

Hi,

We originally thought this might be caused by different OpenCV or dlib versions across the platforms.
But that does not appear to be the case.

Given the use case where the error occurs, we guess it may be related to running out of memory (OOM).
Could you try running your application with cuda-memcheck to get more information?

cuda-memcheck python myApp.py

Thanks.

kp688 · April 14, 2019, 5:44am

Hi,

I ran the cuda-memcheck command with my Python script and got the following output.

========= CUDA-MEMCHECK
========= Internal Memcheck Error: Initialization failed
=========     Saved host backtrace up to driver entry point at error
=========     Host Frame:/usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1 (cuDevicePrimaryCtxRetain + 0x154) [0x1fd7d4]
=========     Host Frame:/usr/local/lib/python3.6/dist-packages/dlib.cpython-36m-aarch64-linux-gnu.so [0x8389c4]
=========

AastaLLL · April 18, 2019, 1:38am

Hi,

Thanks for your log. It does not look like a memory issue.

We are checking the dlib source code and still need some time before we can give a suggestion.
http://dlib.net/files/dlib-19.6.tar.bz2

Stay tuned.

zhougz · April 25, 2019, 9:16am

Hello, I also encountered the same problem with NaN. Have you found a solution? Please let me know, thank you very much!

CandyDan · April 29, 2019, 12:28pm

Same problem with the Jetson Nano. I sometimes get values, but most of the time NaN. The face_encoding examples from dlib aren’t working either.

AastaLLL · April 30, 2019, 3:24am

Hi,

We are working on this issue but still need some time.
Stay tuned.

AastaLLL · May 3, 2019, 7:00am

Hi,

We found that this issue may come from cuDNN and are checking with our internal team now.
Thanks.

mgraves03 · May 5, 2019, 9:51pm

Glad to hear that it sounds like the developers are making progress on identifying this bug.

FWIW the memcheck error above appears to come from not running the utility as root.

I get no initialization errors after running the cuda-memcheck utility as root.

Using the code listed above, below are the expected and received results for the first entry in my NumPy array:

Expected: -0.08488056
Received: 1.13017666e+18

I hope that helps.
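
A quick way to flag the broken output described in this thread is to check every value of an encoding before using it. This is only a minimal sketch: the helper name `is_valid_encoding` and the magnitude limit of 10.0 are assumptions chosen for illustration (healthy values reported here are around 0.08, broken ones around 1e+18), not something dlib or face_recognition defines.

```python
import math


def is_valid_encoding(encoding, limit=10.0):
    """Return True if the encoding has 128 finite values of plausible size.

    `limit` is an assumed sanity bound, not a dlib constant.
    """
    values = [float(v) for v in encoding]  # also accepts a NumPy array
    if len(values) != 128:
        return False
    return all(math.isfinite(v) and abs(v) <= limit for v in values)


if __name__ == "__main__":
    good = [-0.08488056] * 128
    bad = [1.13017666e+18] + good[1:]
    print(is_valid_encoding(good), is_valid_encoding(bad))
```

Calling it on each array returned by face_recognition.face_encodings makes it easy to log which images trigger the problem.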

AastaLLL · May 7, 2019, 2:11am

Hi,

We found a workaround to unblock this issue.
Please use the basic cudnnConvolutionForward algorithm instead.

1. Download source

wget http://dlib.net/files/dlib-19.16.tar.bz2
tar jxvf dlib-19.16.tar.bz2

2. Apply this change:

diff --git a/dlib/cuda/cudnn_dlibapi.cpp b/dlib/cuda/cudnn_dlibapi.cpp
index a32fcf6..6952584 100644
--- a/dlib/cuda/cudnn_dlibapi.cpp
+++ b/dlib/cuda/cudnn_dlibapi.cpp
@@ -851,7 +851,7 @@ namespace dlib
                         dnn_prefer_fastest_algorithms()?CUDNN_CONVOLUTION_FWD_PREFER_FASTEST:CUDNN_CONVOLUTION_FWD_NO_WORKSPACE,
                         std::numeric_limits<size_t>::max(),
                         &forward_best_algo));
-                forward_algo = forward_best_algo;
+                //forward_algo = forward_best_algo;
                 CHECK_CUDNN(cudnnGetConvolutionForwardWorkspaceSize( 
                         context(),
                         descriptor(data),

3. Build and install

cd dlib-19.16
mkdir build
cd build
cmake ..
cmake --build .
cd ..
sudo python setup.py install
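
For readers who are not familiar with applying a diff by hand, the one-line change from step 2 can also be made with a short script before building. This is a rough sketch, not an official tool: the helper name `comment_out_assignment` is hypothetical, and it assumes the unpatched file contains the line `forward_algo = forward_best_algo;` exactly as shown in the diff. Run it from the root of the extracted dlib source directory.

```python
import re
from pathlib import Path

# Matches the assignment from the diff only when it is not yet commented out.
_PATTERN = re.compile(r"^([ \t]*)(forward_algo = forward_best_algo;)", re.MULTILINE)


def comment_out_assignment(source: str) -> str:
    """Prefix the forward_algo assignment with // and leave the rest untouched."""
    return _PATTERN.sub(r"\1//\2", source)


if __name__ == "__main__":
    # Path taken from the diff header, relative to the dlib source root.
    path = Path("dlib/cuda/cudnn_dlibapi.cpp")
    if path.exists():
        path.write_text(comment_out_assignment(path.read_text()))
        print("patched", path)
    else:
        print("file not found; run this from the dlib source root")
```

Running the script a second time leaves the file unchanged, because the already commented line no longer matches the pattern.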

Our internal team is still checking the cuDNN issue and will let you know of any progress.
Thanks.

mgraves03 · May 7, 2019, 2:17am

I will give this a try and let you guys know if it works. Thank you guys for the help and swift response.

mgraves03 · May 7, 2019, 7:03am

Hey @AastaLLL

It appears that the patch you have provided works as a temporary solution.

Again using the sample code from earlier in the thread, below were my results from testing.

Expected: -0.08488056
Received: -0.08488055

Slight change in accuracy but that’s probably from using a different model in the DLIB library?
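
A difference in the last printed digit is small enough to be treated as equal for most purposes, and it is plausibly just rounding from a different convolution algorithm rather than a different model. A minimal sketch of a tolerant comparison follows; the helper name `roughly_equal` and the relative tolerance of 1e-6 are assumptions chosen for single-precision values, not values taken from dlib.

```python
import math


def roughly_equal(expected, received, rel_tol=1e-6):
    """Compare two embedding values with an assumed relative tolerance."""
    return math.isclose(expected, received, rel_tol=rel_tol)


if __name__ == "__main__":
    # Values reported in this thread: after the patch, and before it.
    print(roughly_equal(-0.08488056, -0.08488055))
    print(roughly_equal(-0.08488056, 1.13017666e+18))
```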

I will still be waiting for the patch when it comes out, but I can confirm that this works as an immediate solution.

For anyone else who needs to use the workaround above: if you previously installed dlib via pip, make sure you remove it before or after running setup.py in the instructions above. If you do not, you will still get NaN and accuracy errors despite manually compiling and installing dlib.
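
To see which dlib installation Python actually picks up, it can help to print the module's file path, version, and CUDA flag. The sketch below is only an illustration: `describe_module` is a hypothetical helper, and it reads `__version__` and `DLIB_USE_CUDA` defensively with `getattr` in case a given build does not expose them.

```python
import importlib
import importlib.util


def describe_module(name):
    """Return path, version, and CUDA flag of an importable module, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # module is not installed
    module = importlib.import_module(name)
    return {
        "path": spec.origin,
        "version": getattr(module, "__version__", None),
        "cuda": getattr(module, "DLIB_USE_CUDA", None),
    }


if __name__ == "__main__":
    print(describe_module("dlib"))
```

If the printed path points to a copy installed by pip rather than the patched build, that copy is the one still producing the NaN values.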

I can also confirm that pulling the current version of DLIB (19.17 at the moment) via git and applying this patch works.

Thank you @AastaLLL

Flaty · May 10, 2019, 2:38pm

Hello,

I tried to install this patch, but I don’t know how to apply the changes.
I am not a Linux expert. ;)

Thanks for any advice

Flaty · May 10, 2019, 5:11pm

Deleted
