Hi all,
I’m trying to launch federated training using Clara Train v4, but I cannot establish a server-client connection.
I generate the server and client packages using the provisioning tool and the following project.yml file:
project.yml (2.3 KB)
I can launch the server correctly (as shown by the terminal output):
server.txt (441 Bytes)
However, when I try to launch the client and connect it to the server, it fails:
client_error.txt (938 Bytes)
I launch the client and server Docker containers with the following files:
server_docker.txt (583 Bytes)
client_docker.txt (753 Bytes)
Do you have any clues as to what may be going wrong?
Thanks in advance,
Gonzalo Quintana
Hi,
It seems you cannot connect to / reach MyServer. I see your docker files map the host network. Is MyServer the name of your machine? I guess not, so FL doesn’t know how to resolve this name. You could add it to /etc/hosts as:
<123.32.2.4 your real IP, not 127.0.0.1> MyServer
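As a concrete sketch (123.32.2.4 is only a placeholder, not your actual address), the entry on each client machine would look like the first block below. An alternative is Docker's --add-host flag, which writes the same mapping into the container's /etc/hosts when the container starts:

```
# /etc/hosts on the client machine (use the server's real IP, not 127.0.0.1)
123.32.2.4    MyServer
```

```
# equivalent mapping injected at container start (other run options omitted)
docker run --add-host=MyServer:123.32.2.4 ...
```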
However, before doing that, I strongly recommend you go over all the FL notebooks; they will get you started with FL, from having the clients and server in the same docker all the way to having clients and the server on different machines.
For example, the FL Admin notebook walks you through the workflow of the FL admin who conducts FL experiments. This is the only persona that has control over the FL experiments: once the server and clients have started, the lead researcher can run FL experiments using the CLI through the admin client. The following types of commands are available:
- Check system operating status
- View system logs
- Deploy MMARs (training configuration) to server and clients
- Start and stop training
- Clean up training results data (not the training datasets)
- Shut down and restart the server or clients

Prerequisites: run the Provisioning notebook and start the server; optionally, look at the Client notebook.
All FL notebooks are at clara-train-examples/PyTorch/NoteBooks/FL in the NVIDIA/clara-train-examples repository on GitHub.
Hope this helps