I would like to simulate federated learning with the Clara Train SDK to run some experiments. Since I only have a single GPU available, the clients have to share it, which means the clients must train one after the other, and each client has to release all of its reserved GPU memory once its train_step is done.
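To make the idea concrete, here is a rough, Clara-agnostic sketch of what I have in mind: running each client's local training in a short-lived subprocess, so that all GPU memory is released back to the system when the process exits. Note that `train_one_round` and `global.ckpt` are hypothetical placeholders for illustration, not actual Clara Train APIs:

```python
# A rough sketch of the idea: run each client's local training in a
# short-lived subprocess so that all GPU memory it reserved is returned
# to the system when the process exits. "train_one_round" and
# "global.ckpt" are hypothetical placeholders, not Clara Train APIs.
import multiprocessing as mp

def train_one_round(client_id: str, weights_path: str) -> None:
    # Import the GPU framework *inside* the child so the CUDA context is
    # created, used, and destroyed entirely within this process.
    import tensorflow as tf  # Clara Train is TensorFlow-based
    # ... load the global weights from weights_path, run the client's
    # local training, and write the updated weights back to disk ...
    print(f"{client_id}: finished one local round")

if __name__ == "__main__":
    # "spawn" avoids inheriting any CUDA state from the parent via fork.
    ctx = mp.get_context("spawn")
    clients = ["client_a", "client_b", "client_c"]
    for federated_round in range(3):
        for cid in clients:
            p = ctx.Process(target=train_one_round, args=(cid, "global.ckpt"))
            p.start()
            p.join()  # clients train strictly one after the other
        # ... aggregate the clients' updated weights here (e.g. FedAvg) ...
```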
I am not sure whether something like this is achievable with Clara Train. Which modifications would I need to make?
fed_client.py and client_model_manager.py look like promising candidates, but before trying to implement this I would love to get some tips from someone who knows the full code base on whether it can work.
Thanks for any tips!