Hello,
We are trying to run federated learning with the brats_segmentation model from this MMAR: https://ngc.nvidia.com/models/ea-nvidia-clara-train:clara_pt_brain_mri_segmentation
The MMAR does not include config_fed_client.json or config_fed_server.json, so we took these files from the Clara Train 3.1 MMARs and adapted them.
Attachments: config_fed_server.json (1.5 KB), config_fed_client.json (719 Bytes)
We get the following error when starting the server:
Starting Admin Server flc1 on Port 8003
Server has been started.
2021-02-21 16:06:00,581 - ClientManager - INFO - Client: New client org1-a@10.65.199.147 joined. Sent token: 8489aace-7fa6-48a8-bd30-4cf780fa7200. Total clients: 1
2021-02-21 16:06:19,481 - ClientManager - INFO - Client: New client org1-b@10.65.199.147 joined. Sent token: a71751ee-d132-4df9-a43c-0520bf50b6ce. Total clients: 2
Check server status.
Error processing config /workspace/startup/../run_20/mmar_server/config/config_train.json: local variable 'trainer' referenced before assignment
Traceback (most recent call last):
File "server/sai.py", line 368, in start_server_training
File "utils/wfconf.py", line 163, in configure
File "utils/wfconf.py", line 158, in configure
File "utils/wfconf.py", line 154, in _do_configure
File "apps/fed_learn/fl_conf.py", line 197, in finalize_config
UnboundLocalError: local variable 'trainer' referenced before assignment
FL server execution exception: local variable 'trainer' referenced before assignment
2021-02-21 16:07:53,145 - BaseServer - INFO - Stopping server training...
2021-02-21 16:07:53,146 - ServerModelManager - INFO - closing the model manager
2021-02-21 16:07:53,146 - BaseServer - INFO - Round time: 158 second(s).
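For context, here is how we read the exception itself. In Python, an UnboundLocalError means a function used a local variable that was only assigned on a code path that never ran. The sketch below is a generic illustration of that pattern, not the Clara Train source: `finalize_config`, `try_finalize`, and the "train" / "trainer" key names are hypothetical stand-ins we made up for the example. Our guess (unconfirmed) is that something similar happens when config_train.json lacks a section the FL code expects.

```python
def finalize_config(config: dict) -> str:
    """Hypothetical stand-in for a config finalizer (NOT the Clara Train code)."""
    for name, section in config.items():
        if name == "train":  # assumed key name, for illustration only
            # 'trainer' is bound only if a "train" section is present
            trainer = section.get("trainer")
    # If the loop never hit the branch above, 'trainer' was never assigned
    # and the next line raises UnboundLocalError.
    return f"using trainer: {trainer}"


def try_finalize(config: dict) -> str:
    """Run finalize_config and report an UnboundLocalError instead of crashing."""
    try:
        return finalize_config(config)
    except UnboundLocalError as exc:
        return f"UnboundLocalError: {exc}"


if __name__ == "__main__":
    # Config with the expected section: works
    print(try_finalize({"train": {"trainer": {"name": "X"}}}))
    # Config without it: reproduces the same exception type as in our log
    print(try_finalize({"validate": {}}))
```

If this reading is right, the error points at a mismatch between our config files and what the server expects, rather than at a problem with the clients joining.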
A couple of questions:
Q1. What are the valid configurations for config_fed_server.json and config_fed_client.json for this brats_segmentation MMAR, and how do we resolve the error above?
Q2. How do we decide which pre_processors and post_processors to use for a particular MMAR?
Thanks,
Siddharth