Federated server¶
In this tutorial, we will guide you through using the Federated Learning (FL) server in the AI4OS platform to run an FL training.
For more information, see the Getting Started step-by-step guide available in the federated server repository, as well as the tutorial on using Federated Learning within the AI4OS Platform.
Deploying a Federated server¶
The workflow for deploying an FL server is similar to the one for deploying a module.
In this particular case, you will need to pay attention to:
- The service deployed: When configuring the deployment of the FL server, we recommend selecting JupyterLab or VS Code as the service to run if you want to monitor the process. If you select fedserver, the FL server will be started automatically, but you will not be able to monitor the process (e.g. whether there is a failure, how the clients are connected, or whether any of them has disconnected).
- The Docker tag: In the first configuration step you must select the Docker tag. Note that the tag tokens will deploy the federated server with authentication enabled between the server and the clients (more info in the next sections).
- The Federated configuration: The last section (Federated configuration) will let you choose specific configuration for the FL training server, such as:
  - how many rounds you will train,
  - the minimum number of clients,
  - the federated aggregation methods and the metric(s) analyzed,
  - etc.
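To make these settings more concrete, the following is a minimal, library-free sketch of what one aggregation round does with a minimum number of clients. It assumes weighted federated averaging (FedAvg) as the aggregation method, which is only one of the possible choices; the function names `fed_avg` and `run_round` are hypothetical and are not part of the fedserver code.

```python
def fed_avg(client_results):
    """Weighted average of client weight vectors.

    client_results is a list of (weights, num_examples) pairs, where
    weights is a flat list of floats. Clients with more examples
    contribute proportionally more to the aggregated model.
    """
    total_examples = sum(n for _, n in client_results)
    dim = len(client_results[0][0])
    return [
        sum(weights[i] * n for weights, n in client_results) / total_examples
        for i in range(dim)
    ]


def run_round(client_results, min_clients):
    """Aggregate one round, refusing to run with too few clients."""
    if len(client_results) < min_clients:
        raise RuntimeError(
            f"only {len(client_results)} clients available, "
            f"{min_clients} required"
        )
    return fed_avg(client_results)


# Hypothetical example: two clients holding 1 and 3 training examples.
results = [([1.0, 2.0], 1), ([3.0, 4.0], 3)]
global_weights = run_round(results, min_clients=2)
```

The number of rounds then simply corresponds to how many times such an aggregation step is repeated, with the aggregated weights sent back to the clients between rounds.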
Federated learning training in AI4EOSC¶
Starting the Federated Learning server¶
Note
This step is not needed if you configured the deployment to run with the fedserver option.
If you deployed with JupyterLab/VS Code, open the IDE and start the fedserver process:
$ cd federated-server/fedserver
$ python3 server.py
If you want to change any parameters in the federated configuration, you can always modify fedserver/server.py.
Retrieve the configuration¶
Now that your fedserver is running, follow these steps:
- Find the endpoint where your server is deployed:
  Once your FL server is running, go back to the Dashboard, find your deployment, click on Info and copy the URL of the fedserver endpoint.
- Find the secret token of your deployment:
  Note
  This step is only needed if you selected the tokens Docker tag during configuration.
  AI4OS provides users with a token-based system that can be used for authenticating the clients prior to their incorporation into the federated training.
  To access the secret token, find your deployments and click the icon. You can generate as many tokens as needed (e.g. one token per client), as well as revoke them.
- Share them with the clients:
  Note
  This step is only needed if you selected the tokens Docker tag during configuration.
  You will need to share the endpoint and the appropriate token with the clients that will take part in the training. The section below explains how the clients can use them to connect to the training.
Connecting the clients¶
In order to connect the clients to the FL server deployed within the platform, two approaches can be followed depending on where the clients are running:
- Clients running locally on the user’s resources or on servers external to the platform.
This is the classic approach, since in an FL training the data should generally not leave the server where they are stored. In most cases, privacy restrictions apply to the data and prevent their centralization. Thus, to connect to the server, each client must know the UUID of the deployment where the FL server is running, as well as the datacenter that hosts it (IFCA or IISAS). If the server was created using tokens, you can additionally pass the call_credentials parameter, as explained in the following section.
Each client can then connect to the server as follows:
import certifi
import flwr as fl
from pathlib import Path

# Start -> connecting with the server
uuid = "*********************"  # UUID of the deployment with the FL server (dashboard)
data_center = "****"  # The value for the data center can be ifca or iisas (lowercase)
endpoint = f"ide-{uuid}.{data_center}-deployments.cloud.ai4eosc.eu"
fl.client.start_client(
    server_address=f"{endpoint}:443",
    client=Client(),  # Flower client class, see the example in the next section
    root_certificates=Path(certifi.where()).read_bytes(),
)
- Clients running on different deployments of the platform.
If you are running your clients from different deployments created in the platform, you first have to find the IP of the server from the server side. To do so, go to the deployment in which you have started the server, open a terminal and run:
env | grep NOMAD_HOST_ADDR_fedserver
This will provide the IP and the port in which the FL server is running.
Then, from the client side, you can start the client as follows (again, you can add the call_credentials parameter if needed), introducing the IP and port from the server side as server_address:
# Start -> connecting with the server
server_address = "*********************"  # FILL IN WITH THE SERVER IP AND PORT FOR FL (server side)
fl.client.start_client(
    server_address=server_address,
    client=Client(),
)
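The two ways of obtaining the server address described above can be summarised in a small helper. This is only a sketch: the function names `external_address` and `internal_address` are hypothetical, the hostname pattern is copied from the example for external clients, and the code assumes that `NOMAD_HOST_ADDR_fedserver` holds a value of the form `IP:port`.

```python
import os

VALID_DATA_CENTERS = ("ifca", "iisas")


def external_address(uuid, data_center, port=443):
    """Address for clients running outside the platform."""
    data_center = data_center.lower()
    if data_center not in VALID_DATA_CENTERS:
        raise ValueError(f"data_center must be one of {VALID_DATA_CENTERS}")
    return f"ide-{uuid}.{data_center}-deployments.cloud.ai4eosc.eu:{port}"


def internal_address(environ=None):
    """Address for clients in other deployments of the platform.

    Must be evaluated on the server side, where the variable is set.
    Assumption: the value has the form "IP:port".
    """
    environ = os.environ if environ is None else environ
    value = environ.get("NOMAD_HOST_ADDR_fedserver")
    if not value:
        raise KeyError("NOMAD_HOST_ADDR_fedserver is not set")
    host, _, port = value.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"unexpected address format: {value!r}")
    return f"{host}:{port}"
```

Either result can then be passed as server_address when starting the client.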
Client-server authentication¶
In the AI4OS project, we use a custom fork of the Flower library to perform FL training.
In the code below, we provide an example of how to integrate the previously obtained token and endpoint into the client code. More examples are available here.
import flwr as fl
from pathlib import Path
import certifi
import ai4flwr.auth.bearer
# Read the data, create the model
# (...)
# Create the class Client(), example of Flower client:
class Client(fl.client.NumPyClient):
    def get_parameters(self, config):
        return model.get_weights()

    def fit(self, parameters, config):
        model.set_weights(parameters)
        model.fit(x_train, y_train, epochs=5, batch_size=16)
        return model.get_weights(), len(x_train), {}

    def evaluate(self, parameters, config):
        model.set_weights(parameters)
        loss, accuracy = model.evaluate(x_test, y_test)
        return loss, len(x_test), {"accuracy": accuracy}
token = "*********************" # INCLUDE THE TOKEN GENERATED IN THE DASHBOARD
auth_plugin = ai4flwr.auth.bearer.BearerTokenAuthPlugin(token)
# Start -> connecting with the server
endpoint = "*********************" # FILL IN WITH THE ENDPOINT (dashboard) OR THE SERVER ADDRESS
fl.client.start_client(
    server_address=f"{endpoint}:443",
    client=Client(),
    root_certificates=Path(certifi.where()).read_bytes(),
    call_credentials=auth_plugin.call_credentials(),
)
If you didn't select token authentication, feel free to remove the call_credentials parameter in the start_client() function.
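If the same client script has to work both with and without token authentication, one option is to assemble the arguments conditionally. The helper below is a hypothetical sketch (the name `build_start_client_kwargs` is not part of Flower or ai4flwr); it only builds a dictionary of the parameters already used in the example above and does not contact any server.

```python
def build_start_client_kwargs(server_address, client,
                              root_certificates=None,
                              call_credentials=None):
    """Collect start_client() arguments, skipping optional ones that are unset."""
    kwargs = {"server_address": server_address, "client": client}
    if root_certificates is not None:
        kwargs["root_certificates"] = root_certificates
    if call_credentials is not None:
        # Only present when the server was deployed with the tokens Docker tag.
        kwargs["call_credentials"] = call_credentials
    return kwargs
```

With the objects from the example above, the call would then read `fl.client.start_client(**build_start_client_kwargs(...))`, passing `auth_plugin.call_credentials()` only when a token is available.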
Server side differential privacy¶
Differential privacy states that an algorithm is differentially private if, by viewing its result, an adversary cannot tell whether a particular individual's data is included in the database used to produce that result. This can be achieved by adding controlled noise using different mechanisms, such as Laplace, Exponential, Gaussian, etc. The privacy budget controls the amount of noise, i.e. the trade-off between the level of privacy and the utility of the data.
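As an illustration of how the privacy budget controls the noise, here is a minimal sketch of the Laplace mechanism, in which the noise scale is the query sensitivity divided by the privacy budget epsilon. The function names are hypothetical and this is not the mechanism used by the platform (the server-side option described below uses Gaussian noise); it is only meant to show that a smaller budget means more noise.

```python
import random


def laplace_scale(sensitivity, epsilon):
    """Noise scale of the Laplace mechanism: sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return sensitivity / epsilon


def laplace_mechanism(true_value, sensitivity, epsilon, rng=None):
    """Return true_value plus Laplace noise calibrated to epsilon."""
    rng = rng or random.Random()
    scale = laplace_scale(sensitivity, epsilon)
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace distribution centred at 0 with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_value + noise


# A smaller privacy budget gives a larger noise scale (more privacy, less utility).
strict = laplace_scale(sensitivity=1.0, epsilon=0.1)
loose = laplace_scale(sensitivity=1.0, epsilon=1.0)
```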
If you want to start an FL server with additional privacy restrictions when building the global aggregated model, you can add differential privacy (DP) on the server side. You can enable it in the FL configuration when creating the server, where you will need to provide the noise multiplier for the Gaussian mechanism, the clipping norm and the number of clients sampled. This functionality is compatible with each of the aggregation strategies available in the platform. Note that the noise multiplier is not the privacy budget: a greater value of the noise multiplier implies more privacy restrictions (more noise) and less utility. This ensures central DP on the server side when building the global model with fixed clipping.
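The role of the three parameters can be sketched as follows. This is a simplified, library-free illustration and not the platform's implementation: it assumes that each client update is clipped to the clipping norm, that the clipped updates are averaged, and that Gaussian noise with standard deviation `noise_multiplier * clipping_norm / num_sampled_clients` is added to every coordinate. The function names are hypothetical.

```python
import math
import random


def clip_update(update, clipping_norm):
    """Scale the update down so that its L2 norm is at most clipping_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    if norm <= clipping_norm:
        return list(update)
    factor = clipping_norm / norm
    return [x * factor for x in update]


def dp_aggregate(updates, clipping_norm, noise_multiplier, rng=None):
    """Average clipped client updates and add Gaussian noise (fixed clipping)."""
    rng = rng or random.Random()
    num_sampled_clients = len(updates)
    clipped = [clip_update(u, clipping_norm) for u in updates]
    # Assumed noise calibration; a larger multiplier means more noise.
    stddev = noise_multiplier * clipping_norm / num_sampled_clients
    dim = len(clipped[0])
    return [
        sum(u[i] for u in clipped) / num_sampled_clients + rng.gauss(0.0, stddev)
        for i in range(dim)
    ]


# Hypothetical example: two clients, one update exceeding the clipping norm.
updates = [[3.0, 4.0], [0.0, 0.0]]
noisy_mean = dp_aggregate(updates, clipping_norm=1.0, noise_multiplier=0.5,
                          rng=random.Random(42))
```

Setting the noise multiplier to 0 in this sketch disables the noise and leaves only the effect of clipping, which is a convenient way to check the two steps separately.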