Deploying FL on multiple computers with NVFlare

I am trying to run NVFlare in a realistic setup with multiple computers. After the provisioning steps, I started the server, the clients, and the admin from their startup packages. The server started, but the client and admin computers reported a gRPC communication error:

2022-01-05 21:37:08,624 - Communicator - ERROR - Action: client_registration grpc communication error. retry: 1500, First start till now: 0.0013239383697509766 seconds.
2022-01-05 21:37:08,624 - Communicator - ERROR - Could not connect to server: ourserverdomain:8765 Setting flag for stopping training. failed to connect to all addresses

I listed the listening ports on the server with nmap, and it showed 127.0.1.1:8002, which suggests the server is listening only on localhost and is not reachable from other computers. This makes me wonder whether NVFlare currently supports realistic deployments or only POC (proof of concept) mode. Please help me solve this problem, thank you.

Thanks for your interest in NVIDIA FLARE, and welcome to the forums!

When provisioning a realistic setup with multiple computers, the server name defined in project.yml should be the fully qualified domain name (FQDN) where the server can be reached via DNS. See the default project.yml linked below. You would replace “example.com” with your server’s domain name.
Provisioning in NVIDIA FLARE — NVIDIA FLARE 2.0 documentation

If you do not have a DNS entry for your server, you can use the server hostname. In this case, you need to add an entry to the /etc/hosts file on each client and admin machine to associate the server’s IP address with this hostname:

<server IP address> <server hostname>

This will allow you to connect from client to server using just the hostname rather than a fully qualified domain name.
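
To confirm that the name resolution described above works, a small check like the one below can be run on a client or admin machine. This is only a rough sketch using the Python standard library, not part of NVIDIA FLARE: the helper names (resolve, is_loopback, can_connect, diagnose) are made up for illustration, and it tests plain TCP reachability only, not TLS or gRPC.

```python
import ipaddress
import socket


def resolve(host):
    """Return the sorted, de-duplicated IP addresses that `host` resolves to."""
    try:
        infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []  # no DNS record and no /etc/hosts entry for this name
    return sorted({info[4][0] for info in infos})


def is_loopback(address):
    """True for 127.0.0.0/8 and ::1, which other machines cannot reach."""
    # Drop an IPv6 scope suffix such as "%eth0" before parsing.
    return ipaddress.ip_address(address.split("%", 1)[0]).is_loopback


def can_connect(host, port, timeout=3.0):
    """Attempt a plain TCP connection; says nothing about TLS or gRPC."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or unresolved name
        return False


def diagnose(host, port):
    """Collect human-readable findings for one host/port pair."""
    addresses = resolve(host)
    if not addresses:
        return [f"{host} does not resolve; check DNS or /etc/hosts"]
    findings = [f"{host} resolves to {', '.join(addresses)}"]
    if any(is_loopback(a) for a in addresses):
        findings.append("a loopback address is in the list; "
                        "remote machines cannot reach the server through it")
    if can_connect(host, port):
        findings.append(f"TCP connection to {host}:{port} succeeded")
    else:
        findings.append(f"TCP connection to {host}:{port} failed")
    return findings
```

For the setup in the question, `print("\n".join(diagnose("ourserverdomain", 8765)))` would use the host and port shown in the client log. Running the same call on the server itself may also be informative: if the name resolves to a loopback address there, such as the 127.0.1.1 that nmap reported, that could explain why other machines cannot connect.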

Please let me know if you have any questions - happy to help troubleshoot!

-Kris