My team and I are trying to do federated learning using Flower on a Jetson Nano, but have been unable to get it to run on the GPU. We’ve tried the Docker containers from the Flower GitHub repository (https://github.com/adap/flower/tree/main/examples/embedded-devices), but could not build and run them successfully. We also tried the Docker container suggested for machine learning (NVIDIA L4T ML on NVIDIA NGC), but it only runs on the CPU and will not use the GPU.
Is there a Docker image or set of instructions somewhere that will run federated learning on the GPU using Flower on a Jetson Nano? We have been trying various methods for weeks without success. Thanks for any help you can provide!
Hi,
Did you add the --runtime nvidia flag when launching the container?
This mounts the essential system libraries that give the container access to the GPU.
Thanks.
We do - the full command we’re using to launch the ML container is:
sudo docker run -it --rm --runtime nvidia --network host nvcr.io/nvidia/l4t-ml:r32.7.1-py3
Hi,
The GPU should work with that command.
Which library are you using for testing?
Could you check whether the CUDA samples run inside the container?
Thanks.
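For anyone following along, a quick way to answer the question above from inside the container is to ask each framework whether it can see the GPU. This is only a minimal sketch: the helper name `gpu_report` is made up for illustration, and it assumes PyTorch and/or TensorFlow may or may not be installed in the image.

```python
import importlib


def gpu_report():
    """Return a dict mapping framework name to GPU visibility.

    True/False means the framework imported and answered the query;
    None means the framework was missing or the check itself failed.
    """
    report = {}

    try:
        torch = importlib.import_module("torch")
        report["torch"] = bool(torch.cuda.is_available())
    except Exception:  # missing or broken install
        report["torch"] = None

    try:
        tf = importlib.import_module("tensorflow")
        # The experimental namespace is used because it exists in
        # both late TF 1.x releases and TF 2.x.
        gpus = tf.config.experimental.list_physical_devices("GPU")
        report["tensorflow"] = len(gpus) > 0
    except Exception:
        report["tensorflow"] = None

    return report


if __name__ == "__main__":
    for name, status in gpu_report().items():
        print(name, "->", status)
```

If a framework reports False here even with --runtime nvidia, the problem is likely in the container or the framework build rather than in Flower.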
Good morning,
We are able to use the GPU, but not with the flower package installed. So far we can either run federated learning through flower on the CPU only, or use the GPU without being able to install flower>1.0. We’ve tried both PyTorch and TensorFlow.
Thanks!
Hi, it’s unclear if the issue is with the Jetson Nano or with Flower itself. I found installation instructions for the Xavier NX here, which should be similar for the original Nano: Federated Learning on Embedded Devices with Flower - Flower Examples 1.6.0
How far through these steps are you able to progress?
We’ve been able to successfully get through all the steps on a Jetson Xavier NX, but we’re still running into issues on the Nano. We’re going to try loading the Jetpack 5.1.2 image onto the Nano again today and see how that goes. Will update this thread with results.
Update: we’ve been able to run the nvidia docker (l4t-pytorch:r32.7.1-pth1.10-py3) and install flower, and it starts running with gpus. However, now once it connects to the server, it instantly closes the connection and gives us the following error:
Hi,
Did you apply the same steps on the Xavier NX and the Nano?
If so, some packages may not work in the Nano environment, since it is based on Ubuntu 18.04.
Would you mind first checking whether any of Flower’s dependencies cannot be met?
The message indicates that some functions are not implemented.
Thanks.
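One way to run the dependency check suggested above is to print the interpreter version and see which of the relevant modules can be found. This is a rough sketch only: the module list is an assumption for illustration (Flower's own package metadata is the place to look for the real requirements), and `check_environment` is a hypothetical helper, not part of Flower. As far as I know the r32 containers ship Python 3.6, so the Python version is worth looking at first.

```python
import importlib.util
import sys

# Assumed for illustration: modules a Flower client is likely to need.
CANDIDATE_MODULES = ["flwr", "grpc", "google.protobuf", "numpy"]


def check_environment(modules=CANDIDATE_MODULES):
    """Report the Python version and which modules can be located."""
    found = {}
    for name in modules:
        try:
            found[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package is missing or a spec is unset.
            found[name] = False
    return {"python": tuple(sys.version_info[:3]), "modules": found}


if __name__ == "__main__":
    env = check_environment()
    print("Python:", ".".join(str(part) for part in env["python"]))
    for name, present in env["modules"].items():
        print(name, "->", "found" if present else "MISSING")
```

Running the same script on the Xavier NX and on the Nano and comparing the two outputs should show where the environments differ.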
Hi!
We have tried applying the same steps on both devices. The conclusion we’ve reached is that the version of flower we needed requires a newer Jetpack version than is supported on the Jetson Nano. Ultimately, we ended up switching to an MQTT-based communication scheme and wrote the code manually instead of using the flower framework.
Thanks for all the help!
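For readers wondering what a hand-written replacement might look like: the thread does not show the actual code, so the following is only a sketch of the aggregation step, assuming clients publish their weights as JSON payloads and the server does plain federated averaging weighted by sample count. The message fields (`weights`, `num_samples`) and all function names are hypothetical, and the MQTT transport itself is left out.

```python
import json


def encode_update(weights, num_samples):
    """Client side: pack a flat list of weights into a message payload."""
    msg = {"weights": weights, "num_samples": num_samples}
    return json.dumps(msg).encode("utf-8")


def decode_update(payload):
    """Server side: unpack a payload produced by encode_update."""
    msg = json.loads(payload.decode("utf-8"))
    return msg["weights"], msg["num_samples"]


def federated_average(payloads):
    """Average client weights, weighting each client by its sample count."""
    total = 0
    summed = None
    for payload in payloads:
        weights, n = decode_update(payload)
        if summed is None:
            summed = [0.0] * len(weights)
        if len(weights) != len(summed):
            raise ValueError("clients sent weight vectors of different sizes")
        for i, w in enumerate(weights):
            summed[i] += w * n
        total += n
    if total == 0:
        raise ValueError("no samples received")
    return [s / total for s in summed]


if __name__ == "__main__":
    updates = [encode_update([1.0, 2.0], 1), encode_update([3.0, 4.0], 3)]
    print(federated_average(updates))
```

In a real setup the payloads would arrive through the MQTT client's message callback, and the averaged weights would be published back to the clients for the next round.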