
A new approach for cross-silo federated learning and its privacy risks

Fontana, Michele; Naretto, Francesca; Monreale, Anna
2021-01-01

Abstract

Federated Learning has gained increasing popularity in recent years for its ability to train Machine Learning models in critical contexts, using private data without moving it. Most approaches in the literature focus on mobile environments, where each mobile device holds the data of a single user, and typically deal with image or text data. In this paper, we define hcsfedavg, a novel federated learning approach tailored for training machine learning models on data distributed over hierarchically organized federated organizations. Our method focuses on the generalization capabilities of neural network models, providing a new mechanism for selecting their best weights, and is tailored for tabular data. We empirically test our approach on two different tabular datasets, showing excellent results in terms of performance and generalization capabilities. We then tackle the problem of assessing the privacy risk of the users represented in the training data. In particular, by attacking the hcsfedavg models with the Membership Inference Attack, we empirically show that those users may be exposed to a high privacy risk.
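
To make the idea sketched in the abstract more concrete, below is a minimal Python sketch of hierarchical, FedAvg-style aggregation combined with a validation-based selection of the best global weights. It assumes a two-level hierarchy (silos grouped under intermediate aggregators, which report to a global server) and dataset-size-weighted averaging; all function and variable names are illustrative assumptions, not the paper's actual hcsfedavg algorithm.

```python
# Minimal sketch of hierarchical, FedAvg-style aggregation with a
# validation-based "best weights" selection step. All names are
# illustrative assumptions; the paper's hcsfedavg algorithm may differ
# in hierarchy depth, weighting scheme, and selection criterion.

def weighted_average(weight_sets, sizes):
    """FedAvg-style aggregation: average each parameter (array-like),
    weighting every contribution by its local dataset size."""
    total = float(sum(sizes))
    return [
        sum(w[i] * (n / total) for w, n in zip(weight_sets, sizes))
        for i in range(len(weight_sets[0]))
    ]

def hierarchical_round(groups, train_local, validate):
    """One round over an assumed two-level hierarchy.

    groups: list of groups; each group is a list of (weights, n_samples)
            pairs, one per silo.
    train_local / validate: caller-supplied callables (assumed interfaces):
            train_local(weights) -> updated weights after local epochs,
            validate(weights) -> validation score of the global model.
    """
    group_models, group_sizes = [], []
    for silos in groups:
        w_sets = [train_local(w) for w, _ in silos]           # local training per silo
        sizes = [n for _, n in silos]
        group_models.append(weighted_average(w_sets, sizes))  # intermediate aggregation
        group_sizes.append(sum(sizes))
    global_w = weighted_average(group_models, group_sizes)    # top-level aggregation
    return global_w, validate(global_w)

def select_best(history):
    """Keep the round whose global model generalized best on held-out
    validation data (the 'best weights' idea mentioned in the abstract)."""
    return max(history, key=lambda item: item[1])[0]
```

The abstract also mentions attacking the trained models with the Membership Inference Attack. As a rough illustration of why models that overfit their training data can leak membership, the toy check below flags a record as a likely training member when the model is unusually confident on it; real evaluations in the MIA literature typically rely on shadow models and learned attack classifiers rather than a fixed threshold, so this is only a simplified proxy, not the paper's evaluation protocol.

```python
# Toy membership inference check (confidence thresholding), a simplified
# proxy for shadow-model attacks. The threshold is an arbitrary
# illustrative value, not taken from the paper.
import numpy as np

def confidence(model_predict_proba, X, y):
    """Confidence the model assigns to the true class of each record."""
    proba = model_predict_proba(X)              # shape: (n_samples, n_classes)
    return proba[np.arange(len(y)), y]

def membership_guess(model_predict_proba, X, y, threshold=0.9):
    """Guess 'member' when the model is unusually confident on a record."""
    return confidence(model_predict_proba, X, y) >= threshold
```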
2021
978-1-6654-0184-5
Files in this product:
There are no files associated with this product.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11568/1123041
