Homomorphic Encryption with Monai Trainer

Hi there,
I am trying to add homomorphic encryption (HE) to a federated learning project that uses the MONAI Trainer. I used the BYOT_monai project as a reference for using the MONAI Trainer within Clara 4.0, and the HE project to add homomorphic encryption. Both setups work standalone (Clara 4.0 Trainer + HE, and MONAI Trainer without HE), but when I combine them, I get the following error:

Send model to server.
2021-07-19 09:35:49,712 - FederatedClient - INFO - Starting to push model.
2021-07-19 09:35:49,721 - HEModelEncryptor - INFO - weighting client_india by aggregation weight 1.0
Traceback (most recent call last):
  File "<nvflare-0.1.4>/nvflare/private/fed/client/fed_client.py", line 229, in admin_run
  File "<nvflare-0.1.4>/nvflare/private/fed/client/fed_client.py", line 178, in run_federated_steps
  File "<nvflare-0.1.4>/nvflare/private/fed/client/fed_client.py", line 135, in federated_step
  File "<nvflare-0.1.4>/nvflare/private/fed/client/fed_client_base.py", line 217, in push_models
  File "/opt/conda/lib/python3.8/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/opt/conda/lib/python3.8/multiprocessing/pool.py", line 771, in get
    raise self._value
  File "/opt/conda/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/opt/conda/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "<nvflare-0.1.4>/nvflare/private/fed/client/fed_client_base.py", line 161, in push_remote_model
  File "<nvflare-0.1.4>/nvflare/private/fed/client/communicator.py", line 292, in submitUpdate
  File "<nvflare-0.1.4>/nvflare/private/fed/client/data_assembler.py", line 45, in get_contribution_data
  File "<nvflare-0.1.4>/nvflare/experimental/homomorphic_encryption/he_model_encryptor.py", line 156, in process
AssertionError: FL context does not have local iterations for weighting!

Hello, and thank you for your interest in Clara FL and HE. This MONAI-based trainer does not currently provide the local iteration count as part of the Shareable meta information. That count is needed to compute each client's weight for the weighted sum in the FedAvg algorithm. As a workaround, you can put the following line in the generate_shareable(self, train_ctx: TrainContext, fl_ctx: FLContext) function (here):

meta_data[FLConstants.NUM_STEPS_CURRENT_ROUND] = n_iter

Here, n_iter should be the number of local training steps this client executed in the current FL round. For testing purposes, you can set n_iter = 1 (equal weights for all clients) or read the actual iteration count from the trainer engine. This example BYO trainer will be updated accordingly later on.
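To make the workaround concrete, here is a minimal sketch of building the meta dict. The constant value and the trainer-state key `"iteration"` are assumptions for illustration; in a real trainer you would use FLConstants.NUM_STEPS_CURRENT_ROUND and query the MONAI trainer engine for its step count.

```python
# Stand-in for FLConstants.NUM_STEPS_CURRENT_ROUND (assumed name/value).
NUM_STEPS_CURRENT_ROUND = "num_steps_current_round"

def build_shareable_meta(trainer_state, fallback_n_iter=1):
    """Build the Shareable meta dict carrying the local iteration count.

    trainer_state: a dict standing in for the trainer engine's state;
    the "iteration" key is a hypothetical example.
    """
    meta_data = {}
    # Prefer the real step count if the engine exposes one; fall back to 1
    # so every client gets equal weight during testing.
    n_iter = trainer_state.get("iteration", fallback_n_iter)
    meta_data[NUM_STEPS_CURRENT_ROUND] = n_iter
    return meta_data
```

With this meta entry in place, the HEModelEncryptor can read the iteration count from the FL context instead of raising the assertion above.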

Hi @hroth3hm8y,
thank you so much for your help.
The workaround does its job, and the error is resolved!

However, I now get the error that the weights don’t match the architecture of my model:

  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1215, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for BasicUNet:
        size mismatch for conv_0.conv_0.conv.weight: copying a param with shape torch.Size([864]) from checkpoint, the shape in current model is torch.Size([32, 3, 3, 3]).

When I am using the Clara Trainer, I am adding a HE model_reader_writer:

      "model_reader_writer": {
          "path": "nvflare.experimental.homomorphic_encryption.he_pt_model_reader_writer.HEPTModelReaderWriter"
      }
For the MONAI Trainer, I am not using a different reader. Is this where the error comes from?
If so, do you have any idea how I could add it?

Correct. The custom model_reader_writer is needed to reshape the decrypted vectors back to the shapes of the original tensors.

You can add the reshape operation here as

local_var_dict[var_name] = torch.as_tensor(np.reshape(weights, local_var_dict[var_name].shape))
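For illustration, here is a hedged sketch of that reshape step in isolation, using NumPy only (the real reader/writer would wrap the result with torch.as_tensor as in the line above). The function name and dict layout are assumptions, not the actual NVFlare API; it shows why the flat 864-element vector from the error maps back onto the (32, 3, 3, 3) conv kernel, since 32 * 3 * 3 * 3 = 864.

```python
import numpy as np

def restore_shapes(flat_weights, local_var_dict):
    """Reshape decrypted flat vectors back to the model's tensor shapes.

    flat_weights:   var_name -> 1-D decrypted array (hypothetical layout)
    local_var_dict: var_name -> reference array with the target shape
    """
    restored = {}
    for var_name, weights in flat_weights.items():
        target_shape = local_var_dict[var_name].shape
        # np.reshape raises if the element counts disagree, which would
        # indicate a variable-naming or ordering mismatch upstream.
        restored[var_name] = np.reshape(weights, target_shape)
    return restored
```

This mirrors the one-liner above: without the reshape, load_state_dict receives the flat decrypted vector and fails with exactly the size-mismatch error shown earlier.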