Here’s a project that we’re using for neuroscience research: Realtime pupil and eyelid detection with DeepLabCut running on a Jetson Nano.
A short video of the setup is available here.
On the research side, the interest is in getting a realtime readout of animal or human cognitive states, as pupil size is an excellent indicator of attention, arousal, locomotion, and decision-making processes. As one of many possible applications, you could use this setup to trigger a reward when the subject is alert.
The essential components on the hardware side are:
- FLIR (formerly Point Grey) Chameleon3 USB3 camera https://www.flir.com/products/chameleon3-usb3/?model=CM3-U3-13Y3M-CS
- Ricoh 50mm f/1.4 VGA lens (FL-BC5014A-VG) https://industry.ricoh.com/en/fa_camera_lens/lens/vga/
- C to CS mount adapter and 12.5mm extension tube
- NVIDIA Jetson Nano
On the software side:
- JetPack for Jetson Nano. I don't remember whether this came with TensorFlow included - if not, here are the instructions.
- DeepLabCut (DLC). If you want to use DLC in real time, for now you'll need my realtime branch of the code: https://github.com/neurodroid/DeepLabCut/tree/realtime . The change is rather trivial: instead of using cv2.VideoCapture to read from a file, you make it read from a camera (see the sketch after this list). I've set up a virtualenv using the system's Python 3.6 (/usr/bin/python3). Because DLC imposes a lot of version restrictions on its dependencies, I've emptied the install_requires list in DLC's setup.py and installed all required packages manually using pip within the virtualenv.
- aravis 0.6.4. We use version 0.6.4 of aravis to drive the USB3 camera on Linux. Versions >= 0.7 use a build system that's not in the bionic repositories.
- OpenCV 3.4.7. Built with many non-default options, using the virtualenv with Python 3.6 mentioned above. I'm not sure how best to share these options - dump the CMakeVars.txt or the CMakeCache.txt files somewhere? Most importantly, you'll need to build OpenCV with aravis support. Moreover, if you'd like to software-trigger the camera (i.e. trigger a capture from your code rather than from some hardware input), you'll need this patch for OpenCV: https://github.com/opencv/opencv/pull/15714 (see the trigger sketch after this list). I've tried to get it integrated into OpenCV but failed to get my point across.
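To make the realtime change in the DLC item above concrete, here's a minimal sketch of the idea - not the actual code in the branch; the device index and the inference step are placeholders:

```python
import cv2

# The gist of the realtime branch: read frames from a camera instead of a
# video file. Device index 0 is a placeholder for your camera.
cap = cv2.VideoCapture(0)

while cap.isOpened():
    ok, frame = cap.read()  # grab the next live frame
    if not ok:
        break
    # Run the trained DLC network on `frame` here; `predict` is a
    # hypothetical stand-in, not DLC's actual API.
    # keypoints = predict(frame)

cap.release()
```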
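And here's roughly what the capture side could look like once OpenCV is built with aravis support. cv2.CAP_ARAVIS is part of OpenCV's standard backend enum; the trigger property below is only an illustration, so check the patch for the constant it actually introduces:

```python
import cv2

# Open camera 0 through the aravis backend; this requires OpenCV built
# with WITH_ARAVIS=ON (one of the non-default build options mentioned above).
cap = cv2.VideoCapture(0, cv2.CAP_ARAVIS)

# With the software-trigger patch applied, capture can be triggered from
# code instead of free-running. The property shown here is an assumption
# for illustration; the PR is the authoritative reference.
# cap.set(cv2.CAP_PROP_TRIGGER, 1)

ok, frame = cap.read()
if ok:
    print("captured frame of shape", frame.shape)
cap.release()
```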
We’ve trained both resnet50 and MobileNet v2 (v2_0.35_224) as network backbones. About 200 annotated training frames were enough to get robust pupil and eyelid detection in our case. To be clear, we do not train the network on the Jetson - we only use the Jetson to detect pupil and eye size with a trained network.
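Training itself follows the standard DLC workflow on a desktop GPU. Roughly, the calls look like this - project names and paths are placeholders, and you should double-check the net_type strings against your DLC version:

```python
import deeplabcut

# Standard (offline) DLC workflow on a desktop GPU; paths and names are
# placeholders. Annotate the 8 pupil/eyelid points when labeling.
config = deeplabcut.create_new_project(
    "pupil-tracking", "lab", ["/data/videos/eye.avi"])
deeplabcut.extract_frames(config)   # pull frames to annotate (~200 for us)
deeplabcut.label_frames(config)     # label them in the GUI

# Pick the backbone: 'resnet_50' or 'mobilenet_v2_0.35'.
deeplabcut.create_training_dataset(config, net_type="mobilenet_v2_0.35")
deeplabcut.train_network(config)
```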
Without any optimizations, we get about 20fps throughput for a frame size of 160x128 using MobileNet v2. We’re detecting 8 points within each frame (4 poles of the pupil, 4 poles of the eye). There’s probably quite a bit of room for improvement.
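At 20fps, that leaves a budget of roughly 50ms per frame for capture plus inference. If you want to measure throughput on your own setup, a simple loop like this works (the inference step is a placeholder):

```python
import time
import cv2

cap = cv2.VideoCapture(0)  # placeholder device index
n_frames = 200

start = time.time()
for _ in range(n_frames):
    ok, frame = cap.read()
    if not ok:
        break
    # analyze(frame)  # hypothetical per-frame DLC inference step
elapsed = time.time() - start

print(f"throughput: {n_frames / elapsed:.1f} fps")
cap.release()
```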
Let me know if you have any questions.
Kudos to:
The DeepLabCut developers
The European Research Council
Institut Pasteur