NVIDIA FLARE and Kubernetes (K8s)

Hello everyone, I recently came across the NVIDIA FLARE framework, and I like it. Unfortunately, when I tried to simulate an experiment with NVIDIA FLARE on Kubernetes (k8s), the pods went into “CrashLoopBackOff”. I could not find any useful results on Google either.

Additional Info: I was trying this on my private cluster.

I suspect that the “CrashLoopBackOff” occurs because the FL process is in a sleeping state rather than a running state. To test this, I deployed a hello-world Flask app alongside fl_server in the same pod and exposed an extra port. This time there was no “CrashLoopBackOff” error, and I deduce that this is because at least one process (the Flask app) is in a running state. I applied the same workaround to the client pods, and it worked there too.

Now the clients authenticate with the server, but I cannot run the federation using FLAdminAPI. The upload_app method returned this response: “{“details”:{“message”:“Exception: Failed to communicate with Admin Server admin on 3002: [Errno 2] No such file or directory”}”.

All of this has been merely trial and error, and I could not find any resources that explain how to use NVIDIA FLARE with k8s, so I am hoping someone can point me to some.
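For anyone trying to narrow down the upload_app error: in Python, “[Errno 2] No such file or directory” is a FileNotFoundError, so it may point at a missing file on the admin client side rather than a refused connection. The following is only a minimal diagnostic sketch using the Python standard library, not anything from NVIDIA FLARE itself. The host name "fl-server" and the file paths are hypothetical placeholders; port 3002 is taken from the error message above.

```python
import os
import socket


def port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


def missing_files(paths):
    """Return the paths that do not exist on the local filesystem."""
    return [p for p in paths if not os.path.exists(p)]


if __name__ == "__main__":
    # Hypothetical values: replace them with the admin server's service
    # name and the files your admin client is actually configured to read.
    host, port = "fl-server", 3002
    configured_files = ["/path/to/ca_cert", "/path/to/client_cert", "/path/to/client_key"]

    print("admin port reachable:", port_reachable(host, port))
    print("missing files:", missing_files(configured_files))
```

If the port is reachable but some files are reported missing, the paths passed to the admin client (or the volumes mounted into the pod) would be the first thing to check. If the port is not reachable, the Kubernetes Service or port exposure is the more likely suspect.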