The "check_status client" command on the admin side returns "No replies" with Clara SDK federated learning

Hello, everyone,

I have just set up a federated learning cluster using Clara SDK 4.0. It runs on a single machine (localhost) and consists of one server and two clients (ikang-a and ikang-b). Everything worked well across many experiments. Then, suddenly, the admin side could no longer reach the two clients, for a reason I cannot identify. I have checked the config files and they are unchanged. This is very strange, because:

  1. Both clients reported that they registered with the server successfully.

Successfully registered client:ikang-a for XXX. Got token:9dd0a987-3ad7-4175-b07e-c45fcbe1c7fa
Successfully registered client:ikang-b for XXX. Got token:97c4e45c-a555-43ec-979b-325978c01736

  2. The server reported that both clients joined.

New client ikang-a@xxx.xxx.xxx.xxx joined. Sent token: 9dd0a987-3ad7-4175-b07e-c45fcbe1c7fa. Total clients: 1
New client ikang-b@xxx.xxx.xxx.xxx joined. Sent token: 97c4e45c-a555-43ec-979b-325978c01736. Total clients: 2

  3. The "check_status server" command on the admin side returned normally.

check_status server
FL run number has not been set.
FL server status: training not started
Registered clients: 2

| CLIENT NAME | TOKEN | LAST ACCEPTED ROUND | CONTRIBUTION COUNT |
| ikang-a | 9dd0a987-3ad7-4175-b07e-c45fcbe1c7fa | | 0 |
| ikang-b | 97c4e45c-a555-43ec-979b-325978c01736 | | 0 |

  4. But "check_status client" returned "No replies".

check_status client
instance:ikang-a : No replies
instance:ikang-b : No replies

As a result, starting training on the client side failed.
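To make the mismatch explicit, here is a minimal sketch that compares the two console outputs. It assumes the outputs are pasted in as plain text and that the line formats are exactly the ones shown above; the function names `registered_clients` and `silent_clients` are hypothetical helpers of mine, not part of nvflare.

```python
import re

# Tokens in the server table look like UUIDs, e.g. 9dd0a987-3ad7-4175-b07e-c45fcbe1c7fa.
UUID_RE = re.compile(r"[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}")


def registered_clients(server_output):
    """Return the client names found in the table of `check_status server` output."""
    names = set()
    for line in server_output.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # A data row is a client name followed by a UUID-like token;
        # the header row and status lines do not match.
        if len(cells) >= 2 and UUID_RE.fullmatch(cells[1]):
            names.add(cells[0])
    return names


def silent_clients(client_output):
    """Return the client names reported as `No replies` by `check_status client`."""
    names = set()
    for line in client_output.splitlines():
        match = re.match(r"\s*instance:(\S+)\s*:\s*No replies\s*$", line)
        if match:
            names.add(match.group(1))
    return names


if __name__ == "__main__":
    server_text = """\
Registered clients: 2
| CLIENT NAME | TOKEN | LAST ACCEPTED ROUND | CONTRIBUTION COUNT |
| ikang-a | 9dd0a987-3ad7-4175-b07e-c45fcbe1c7fa | | 0 |
| ikang-b | 97c4e45c-a555-43ec-979b-325978c01736 | | 0 |
"""
    client_text = """\
instance:ikang-a : No replies
instance:ikang-b : No replies
"""
    # Clients the server knows about that nevertheless do not answer the admin.
    unreachable = registered_clients(server_text) & silent_clients(client_text)
    print(sorted(unreachable))
```

In my case both registered clients end up in the unreachable set, so the registration path works while the admin-to-client command path does not.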

I have no idea how to fix this problem.

The nvflare version is 0.1.4.
By the way, I tried to read the nvflare code to locate the problem, but it is in pyc format. The current version of nvflare on GitHub is 2.0.0+. Maybe I should upgrade nvflare?
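For anyone who wants to compare against their own environment, this is a small sketch of how the installed version can be read from Python. It assumes the package is installed under the distribution name `nvflare` and that Python 3.8+ is used (for `importlib.metadata`); `installed_version` is just a helper name I made up.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumption: the distribution is named "nvflare" in this environment.
    print("nvflare:", installed_version("nvflare"))
```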

Thanks for your help!

Steven