Google MediaPipe Real-Time Hand Tracking on Nano

This would be great to test on Nano. Has anyone done this yet? TIA.

I am also greatly interested in doing it.

Does mediapipe work with nano?

Installation is a problem: Bazel is not supported on Jetson.

I have managed to install Bazel on the Jetson. If you are interested, feel free to message me so I can share the compiled version and further instructions.

I really would like to try mediapipe on the jetson or see other people trying it out.

By the way, did you try other similar pipeline frameworks on the jetson?

Frederic, I’ve been busy turning my Nano into an autonomous JetBot, but I have a spare Xavier that would be awesome for this. I can be reached at Dennis[at]faucher[dot]net. Thanks for sharing your progress. I’ll share mine as well.

I’ve managed to build mediapipe on my nano, and the hand tracking demo works quite well with the gpu. I have not yet been able to modify the examples, since bazel is new to me.

Follow https://github.com/google/mediapipe/blob/master/mediapipe/docs/install.md#installing-on-debian-and-ubuntu, with the following caveats:

  1. bazel - used 1.2.1 as recommended in the mediapipe docs and followed https://docs.bazel.build/versions/master/install-compile-source.html#bootstrap-bazel
  2. I used my own version of opencv, so I followed the instructions for that (modifying WORKSPACE and opencv_linux.BUILD) - hint: new_local_repository goes in WORKSPACE and cc_library in opencv_linux.BUILD
  3. glog - https://github.com/google/mediapipe/issues/304 - you can use the config.guess and config.sub linked to at the bottom of https://github.com/google/mediapipe/issues/470 and put them in the cache as described in issue 304 (first link)
  4. compiling examples - I just hacked up my /usr/include/EGL/eglplatform.h (after saving the original) to take the correct ifdef path and avoid the conflict between TFLite’s GPU Status and the X11 Status, as discussed in https://github.com/google/mediapipe/issues/305
  5. If I recall correctly, at some point I got an error while compiling one of the examples and I restored my eglplatform.h to its original state. (Sorry, I didn’t take notes because I didn’t really expect it to work)
  6. Run demos - I followed https://github.com/google/mediapipe/blob/master/mediapipe/docs/examples.md rather than the readme in the examples directory and they mostly ran.

Hope this helps.

@mario.papini
did you manage to read directly from CSI sensor?

No, didn’t try. Used a webcam but with my own v4l-based code (i.e., not with cv::VideoCapture), since I need better control of all of the camera parameters and to share the camera with another thread, so it “should” work.

At this stage, I have the whole thing running under Qt, with one thread handling the camera, one thread being mediapipe with gesture recognition and another thread being dlib with facial recognition (and obviously other threads for GUI etc.). I use Qt’s signal / slot mechanism as the IPC to get all of the data from mediapipe to the gui. For now I still use mediapipe’s renderers, but that’s just for debug - I’ll eliminate those once I feel the system is more “stable”.
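
For anyone trying to picture the thread layout above, here is a minimal sketch of the same shape using only Python's standard library. It is not the Qt code described in this post: plain queues stand in for the signal / slot mechanism, and all names (`run_pipeline`, `camera`, `worker`, the "hands" and "faces" stand-in functions) are made up for illustration.

```python
import queue
import threading

STOP = object()  # sentinel telling a worker to shut down


def camera(frames, outputs):
    """Camera thread: hand every frame to each consumer's queue."""
    for frame in frames:
        for out in outputs:
            out.put(frame)
    for out in outputs:
        out.put(STOP)


def worker(name, process, inbox, results):
    """Algorithm thread: process frames and post results for the GUI side."""
    while True:
        frame = inbox.get()
        if frame is STOP:
            results.put((name, STOP))
            return
        results.put((name, process(frame)))


def run_pipeline(frames, algorithms):
    """Run one camera thread plus one thread per algorithm; collect results."""
    results = queue.Queue()
    inboxes = {name: queue.Queue() for name in algorithms}
    threads = [
        threading.Thread(target=camera, args=(frames, list(inboxes.values())))
    ]
    for name, process in algorithms.items():
        threads.append(
            threading.Thread(
                target=worker, args=(name, process, inboxes[name], results)
            )
        )
    for t in threads:
        t.start()

    collected = {name: [] for name in algorithms}
    running = len(algorithms)
    while running:  # the "GUI" side: drain results until every worker stops
        name, value = results.get()
        if value is STOP:
            running -= 1
        else:
            collected[name].append(value)
    for t in threads:
        t.join()
    return collected


if __name__ == "__main__":
    # Integers stand in for camera frames, lambdas for the two algorithms.
    print(run_pipeline([1, 2, 3], {"hands": lambda f: f * 2,
                                   "faces": lambda f: f + 100}))
```

Each algorithm gets its own inbox, so a slow consumer (facial recognition here) does not block the faster one; in a real system you would probably bound the queues and drop stale frames instead of letting them pile up.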

As an FYI, combining mediapipe and qt is not simple (took me a week of cursing), but surprisingly, the nano handles the big ball of code rather well, with hand tracking at 15+ fps and facial recognition at 7ish fps. The image resolution is 720p, but I only pass 640x480 of the main image to each algorithm (I actually downscale the facial recognition as well). After adding some temporal filters, the gesture recognition works rather well with much less noise (starting point is https://github.com/TheJLifeX/mediapipe/tree/master/hand-gesture-recognition).
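
The post does not say which temporal filters were used, so the following is only one plausible sketch: a sliding-window majority vote over per-frame gesture labels. The class name, window size, vote threshold, and the "FIST" / "FIVE" labels are all assumptions chosen for illustration.

```python
from collections import Counter, deque


class MajorityVoteFilter:
    """Smooth a stream of per-frame gesture labels with a sliding majority vote."""

    def __init__(self, window=7, min_votes=4):
        self.history = deque(maxlen=window)  # most recent raw labels
        self.min_votes = min_votes           # votes needed to switch output
        self.stable = None                   # last label that won a vote

    def update(self, label):
        """Add one raw label and return the current smoothed label."""
        self.history.append(label)
        winner, votes = Counter(self.history).most_common(1)[0]
        if votes >= self.min_votes:
            self.stable = winner
        return self.stable


if __name__ == "__main__":
    smoother = MajorityVoteFilter(window=5, min_votes=3)
    raw = ["FIST", "FIST", "FIVE", "FIST", "FIST", "FIVE", "FIVE", "FIVE"]
    print([smoother.update(label) for label in raw])
    # [None, None, None, 'FIST', 'FIST', 'FIST', 'FIVE', 'FIVE']
```

The single stray "FIVE" in the third frame never reaches the output, at the cost of a couple of frames of latency before a real gesture change is reported.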

I digress. Back to your original question: it’s my understanding that you can use v4l2 to access the csi camera as well. For example, running “v4l2-ctl --list-formats-ext” from the cli works (once you’ve installed it via “sudo apt install v4l-utils”).
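
If you want to run that check from a script, a small wrapper along these lines should do. It only shells out to the same v4l2-ctl command; the function names and the /dev/video0 default are assumptions, so adjust the device node to whatever your camera shows up as.

```python
import shutil
import subprocess


def list_formats_command(device="/dev/video0"):
    """Build the v4l2-ctl invocation for one video device."""
    return ["v4l2-ctl", "-d", device, "--list-formats-ext"]


def list_formats(device="/dev/video0"):
    """Return v4l2-ctl's output, or None if the tool is not installed."""
    if shutil.which("v4l2-ctl") is None:
        return None  # install with: sudo apt install v4l-utils
    result = subprocess.run(
        list_formats_command(device),
        capture_output=True,
        text=True,
        check=False,  # report errors as text instead of raising
    )
    return result.stdout if result.returncode == 0 else result.stderr


if __name__ == "__main__":
    print(list_formats() or "v4l2-ctl not found")
```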

Hope all is well - Mario.

Hi,
You can follow this tutorial:

Eran
