issues with dlib library

Hello, I also encountered the same NaN problem. Have you found a solution? Please let me know, thank you very much!

Same problem with Jetson Nano. I sometimes get values, but most of the time I get NaN. Also, the face_encoding examples from Dlib aren’t working.

Hi,

We are working on this issue but still need some time.
Stay tuned.

Hi,

We found that this issue may come from cuDNN and are checking with our internal team now.
Thanks.

Glad to hear that the developers are making progress on identifying this bug.

FWIW the memcheck error above appears to come from not running the utility as root.

I get no initialization errors after running the cuda-memcheck utility as root.

Using the code listed above, below are the expected and received results for the first entry in my numpy array:

Expected: -0.08488056
Received: 1.13017666e+18

I hope that helps.

Hi,

We found a workaround to unblock this issue.
Please use the basic cudnnConvolutionForward algorithm instead.

1. Download source

wget http://dlib.net/files/dlib-19.16.tar.bz2
tar jxvf dlib-19.16.tar.bz2

2. Apply this change:

diff --git a/dlib/cuda/cudnn_dlibapi.cpp b/dlib/cuda/cudnn_dlibapi.cpp
index a32fcf6..6952584 100644
--- a/dlib/cuda/cudnn_dlibapi.cpp
+++ b/dlib/cuda/cudnn_dlibapi.cpp
@@ -851,7 +851,7 @@ namespace dlib
                         dnn_prefer_fastest_algorithms()?CUDNN_CONVOLUTION_FWD_PREFER_FASTEST:CUDNN_CONVOLUTION_FWD_NO_WORKSPACE,
                         std::numeric_limits<size_t>::max(),
                         &forward_best_algo));
-                forward_algo = forward_best_algo;
+                //forward_algo = forward_best_algo;
                 CHECK_CUDNN(cudnnGetConvolutionForwardWorkspaceSize( 
                         context(),
                         descriptor(data),

3. Build and install

cd dlib-19.16
mkdir build
cd build
cmake ..
cmake --build .
cd ..
sudo python setup.py install

Our internal team is still checking the cuDNN issue, and we will let you know of any progress.
Thanks.


I will give this a try and let you guys know if it works. Thank you guys for the help and swift response.

Hey @AastaLLL

It appears that the patch you have provided works as a temporary solution.

Again using the sample code from earlier in the thread, below were my results from testing.

Expected: -0.08488056
Received: -0.08488055

Slight change in accuracy but that’s probably from using a different model in the DLIB library?

Will still be waiting for the patch when it comes out, but I can confirm that this works for an immediate solution.

For anyone else who needs to use the workaround above: if you previously installed DLIB via pip, make sure you remove it via pip before or after running the setup.py step in the instructions above. If you do not, you will still get NaN and accuracy errors despite manually compiling and installing DLIB.
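One way to check which copy Python actually picks up is to ask the import system where it finds the module. This is only a minimal sketch using the standard library; module_location is a helper name made up for this example:

```python
import importlib.util


def module_location(name):
    """Return the file a top-level module would be loaded from, or None if it is not installed."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # If this path still points at the old pip-installed copy,
    # uninstall that copy and rerun the setup.py step.
    print("dlib would be loaded from:", module_location("dlib"))
```

If the printed path is not the build you just installed, the old copy is probably still shadowing it.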

I can also confirm that pulling the current version of DLIB (19.17 at the moment) via git and applying this patch works.

Thank you @AastaLLL

Hello,

I tried to install this patch, but I don’t know how to apply these changes.
I am not a Linux expert. ;)

Thanks for advice


Hello,

I uninstalled dlib via

pip3 uninstall dlib

and commented out the following line

forward_algo = forward_best_algo;

within the file “cudnn_dlibapi.cpp”.

Then I did step three.

But face_detection is still not working.

Did I do anything wrong?

Best regards
Markus

Hi,

The original issue is that face_recognition.face_encodings returns NaN results; it does not concern face_detection.
Are you hitting a different issue?

Thanks.

Hello AastaLLL,

Yes, and if I haven’t misunderstood post #16, this is a workaround for this problem.

Correct?

Thanks for your support.

Hi, can you please explain how to make these changes? Which file should I open after downloading http://dlib.net/files/dlib-19.16.tar.bz2 with wget and extracting it? Where is this file located? Should I just open it in a text editor and make the changes? Sorry if these questions seem silly to you all; it's just that I don't have much experience with Linux. I just purchased a Jetson Nano board, and the problem I am facing is that while installing dlib for face_recognition, the final step of installing the Python extensions with “python3 setup.py install” always gets stuck at 75% or somewhere above that. The whole system hangs and I cannot even move my mouse pointer. Can you please guide me on what to do? Should I wait even if the system is not responding?

Hey sreehari.mi3,

It is good to read that I am not the only one struggling with this. :-)

I had a similar experience during the installation process, until I increased the swap file to 4GB.

sudo dd if=/dev/zero of=/swapfile bs=1M count=4096
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile

After this my installation problems were gone.
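To confirm that the swap file is really active before starting the build, the SwapTotal field in /proc/meminfo can be read. The following is a rough sketch; swap_total_mib is a name chosen for this example:

```python
import os


def swap_total_mib(meminfo_text):
    """Extract SwapTotal from /proc/meminfo-style text and convert kB to MiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            # The line looks like "SwapTotal:  4194304 kB".
            return int(line.split()[1]) // 1024
    return 0


if __name__ == "__main__":
    if os.path.exists("/proc/meminfo"):
        with open("/proc/meminfo") as f:
            print("active swap:", swap_total_mib(f.read()), "MiB")
    else:
        print("/proc/meminfo not found (not a Linux system?)")
```

A value close to 4096 MiB would indicate that the swapon step took effect.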

Best of luck :-)

If you have any information about applying this workaround mentioned in #16, please let me know.
Thx.

Hi Flaty,
Thanks for the suggestion to increase the swap file size, and a special thank you for including the commands in your reply. I am using a 16 GB card, though, and I think that is why I ran into trouble while trying to increase the swap file size. I have flashed JetPack onto a 32 GB card and am going to try it out tomorrow. Did you use a 32 GB card? Do you think increasing the swap file is necessary if I use a bigger card?

As for the workaround mentioned in #16, I tried it today. I opened the cuda folder and opened the “cudnn_dlibapi.cpp” file in a text editor (I couldn't find any a or b directories). Then I commented out the line so that it reads //forward_algo = forward_best_algo; and saved the file. That's what I could make out from the instructions above. Is that all, NVIDIA team? Do I or anybody else need to do anything else? Can you please confirm? If anybody knows about this, please reply. graves4life55?

Hi,

I use a 32GB card and had no problems increasing the swap file to 4GB.

I did the same as you and built dlib as mentioned, but the result is the same.
Face detection works, but face recognition does not. I don’t get any errors.
I used the same code in my Raspberry Pi environment without any trouble.

I hope @AastaLLL can give us the right hint. :-)

@Sreehari.m13

After you edit the file, you need to compile the library by following step 3 in the instructions that AastaLLL provided. Just saving the file does not change anything.
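For anyone unsure how to make the edit from step 2 by hand, here is a minimal sketch that comments out the line with a short script instead. The path assumes the dlib-19.16 directory created by the tar command in step 1, and comment_out_forward_algo is just a helper name for this example:

```python
from pathlib import Path

TARGET = "forward_algo = forward_best_algo;"


def comment_out_forward_algo(source):
    """Return the source text with the forward_algo assignment commented out."""
    patched = []
    for line in source.splitlines(keepends=True):
        if line.strip() == TARGET:
            # Keep the original indentation and prefix the statement with //.
            indent = line[: len(line) - len(line.lstrip())]
            line = indent + "//" + line.lstrip()
        patched.append(line)
    return "".join(patched)


if __name__ == "__main__":
    # Assumed location; adjust if you extracted the archive somewhere else.
    path = Path("dlib-19.16/dlib/cuda/cudnn_dlibapi.cpp")
    if path.exists():
        path.write_text(comment_out_forward_algo(path.read_text()))
        print("patched", path)
    else:
        print("not found:", path)
```

Running it a second time changes nothing, because an already commented line no longer matches. The rebuild in step 3 is still required afterwards.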

I do not believe that the Jetson Nano has swap enabled by default, so you will need to add at least 4GB of swap space as mentioned in the other comment. This is because compiling dlib takes more memory than is available on the Nano. I’d recommend disabling the manually created swap space once compiling is done, as using an SD card for swap is allegedly not good for the life of the SD card.

I mentioned this before, but I also had to uninstall the existing dlib via pip after manually compiling dlib. (I’d recommend doing this first.)

This can be done via:

sudo pip uninstall dlib

It may not hurt to reinstall face_recognition after manually compiling dlib. Or uninstall it before compiling and reinstall after.

Another thing that may help with resource usage during compiling, if the swap space is not enough, is switching to another tty by pressing Ctrl+Alt+F5 (as an example; you can use almost any of the F keys to select a tty session).

This will take you to a CLI where you can log in and compile without worrying about Unity freezing while it tries to render your window manager during compilation. One of the Ctrl+Alt+F key combinations will take you back to your UI. I don’t recommend switching back while the compile is still running.

Personally, I’d also recommend using a barrel jack power supply instead of microUSB as it’ll provide more power, as well as using as fast of an SD card as possible. I’ve seen issues with the device freezing quite a bit without using a minimum recommended class 10 SD card.

Hope that helps.

One other way to determine whether or not your issue stems from this bug is to print out the numpy array that the face_encodings function returns.

If you have large positive numbers or receive NaN for each of the numbers as the response, this is the indicator that you are having issues with this bug.

e.g.

f_encoding = face_recognition.face_encodings(known_image)[0]
print(f_encoding)

Expected result should be small decimals, not large integers or ‘NaN’
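The same check can be automated instead of reading the printed array by eye. This is only a sketch: encoding_looks_corrupted is a made-up helper name, and the limit of 10.0 is an assumed threshold chosen to separate small decimals like -0.08 from values like 1.13e+18, not something taken from dlib:

```python
import math


def encoding_looks_corrupted(encoding, limit=10.0):
    """Return True if any value is NaN, infinite, or larger in magnitude than limit."""
    return any(not math.isfinite(v) or abs(v) > limit for v in encoding)


if __name__ == "__main__":
    # Values reported earlier in this thread.
    print(encoding_looks_corrupted([-0.08488056]))     # healthy value
    print(encoding_looks_corrupted([1.13017666e+18]))  # value seen with the bug
```

The function accepts any iterable of floats, so the array returned by face_encodings can be passed in directly.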

When I do step 3 I get a failure on the final bit. I am using python3 instead of python…is that an issue? Should I install pip for 2.7? EDIT: This happens on both 19.16 and 19.17…

sudo python3 setup.py install

Output (error message):

In file included from /home/drew/dlib-19.17/dlib/external/pybind11/include/pybind11/pytypes.h:12:0,
                 from /home/drew/dlib-19.17/dlib/external/pybind11/include/pybind11/cast.h:13,
                 from /home/drew/dlib-19.17/dlib/external/pybind11/include/pybind11/attr.h:13,
                 from /home/drew/dlib-19.17/dlib/external/pybind11/include/pybind11/pybind11.h:43,
                 from /home/drew/dlib-19.17/dlib/../dlib/python/pybind_utils.h:6,
                 from /home/drew/dlib-19.17/dlib/../dlib/python.h:6,
                 from /home/drew/dlib-19.17/tools/python/src/opaque_types.h:6,
                 from /home/drew/dlib-19.17/tools/python/src/dlib.cpp:4:
/home/drew/dlib-19.17/dlib/external/pybind11/include/pybind11/detail/common.h:111:10: fatal error: Python.h: No such file or directory
 #include <Python.h>

When using python(2.7) I get:

sudo python setup.py install
Traceback (most recent call last):
  File "setup.py", line 42, in <module>
    from setuptools import setup, Extension
ImportError: No module named setuptools
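
Both messages point at missing build prerequisites rather than at dlib itself: the first at the Python development headers, the second at setuptools. A small sketch to check for them from the interpreter you build with is shown below; build_prerequisites is a name made up for this example. On Ubuntu-based images the header usually comes from the python3-dev package and setuptools from python3-setuptools (python-setuptools for 2.7), but treat those package names as an assumption for your particular image.

```python
import os
import sysconfig


def build_prerequisites():
    """Report whether Python.h and setuptools are visible to this interpreter."""
    include_dir = sysconfig.get_paths()["include"]
    try:
        import setuptools  # noqa: F401  (only testing that the import works)
        has_setuptools = True
    except ImportError:
        has_setuptools = False
    return {
        "Python.h": os.path.isfile(os.path.join(include_dir, "Python.h")),
        "setuptools": has_setuptools,
    }


if __name__ == "__main__":
    for name, found in sorted(build_prerequisites().items()):
        print(name, "found" if found else "MISSING")
```

Run it with the same interpreter (python or python3) that you use for setup.py, since each has its own headers and packages.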