I am developing a project idea around the Jetson Orin NX 8GB (or 16GB). The system will sit in an architecture where federated learning is performed; the idea is to run quantisation-aware training on the Orin. The model is a resource-constrained CNN: according to the paper, it has 26 K parameters and requires 190 K computations per second, and was trained on NVIDIA GTX 1080 Ti GPUs with 12 GB of RAM. The Jetson Orin NX will be a client. For the server (and a second client), a GTX 1070 GPU is used (both that client and the server run on one machine).
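To get a first feel for feasibility on an 8GB board, a back-of-envelope weight-memory estimate for the ~26 K-parameter model is useful (the parameter count is from the cited paper; the per-parameter sizes are the standard fp32/int8 figures):

```python
# Rough weight-memory footprint of the ~26 K-parameter CNN.
params = 26_000

fp32_bytes = params * 4   # 32-bit float weights, as trained
int8_bytes = params * 1   # 8-bit weights after quantisation

print(f"fp32 weights: {fp32_bytes / 1024:.0f} KiB")  # ~102 KiB
print(f"int8 weights: {int8_bytes / 1024:.0f} KiB")  # ~25 KiB
```

So the model weights themselves are tiny; on the Orin the memory pressure will come from activations, optimiser state, the training framework, and the FLARE runtime rather than the parameters.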
My question: is this feasible, and what do I need to take further into account when developing a federated system using an SBC?
The idea is to build a low-cost proof-of-concept system: on one side the Jetson Orin (client), on the other a PC (client and server) with a GPU (1080 Ti). We will perform quantisation-aware training on both the Jetson Orin NX and the PC.
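For reference, the core of quantisation-aware training is "fake quantisation": during the forward pass, weights and activations are rounded to the int8 grid and immediately dequantised, so the network learns to tolerate quantisation error. A minimal NumPy sketch of that operation (an illustration, not the framework's own implementation):

```python
import numpy as np

def fake_quant(x, num_bits=8):
    """Simulated quantisation as used in QAT forward passes:
    map floats to the int8 grid, then dequantise back to float."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    scale = max(float(x.max() - x.min()), 1e-8) / (qmax - qmin)
    zero_point = round(qmin - float(x.min()) / scale)
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale  # floats snapped to the int8 grid

w = np.array([-0.51, 0.0, 0.27, 0.49])
print(fake_quant(w))  # close to w, but on a 256-level grid
```

In practice a framework such as PyTorch inserts these fake-quant nodes automatically and keeps a straight-through gradient, but the arithmetic above is what runs per tensor.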
This minimal system is mainly for exploring the possibilities and limitations of using time-series data (a healthcare application) in a federated setup with NVIDIA FLARE. The target is the Jetson Orin NX 8GB (or even 4GB). We are aware that we may run into the platform's limitations.
(Note: a future exploration step is placing two or three Jetson Orins in a federated setup.)
It looks like FLARE is mainly tested on desktop environments. We need to check with our internal team whether it can run on the Jetson, which is an ARM-based system.