A federated fuzzy c-means clustering algorithm

José Luis Corcuera Bárcena (Member of the Collaboration Group); Francesco Marcelloni; Alessandro Renda; Alessio Bechini; Pietro Ducange
2021-01-01

Abstract

Traditional clustering algorithms require data to be centralized on a single machine or in a datacenter. In several real applications, however, privacy concerns and traffic limitations prevent data from being transferred, which hampers these algorithms because they can operate only on locally stored data. In recent years a new paradigm has been gaining popularity: Federated Learning (FL). FL enables the collaborative training of data mining models while keeping the data local to their owners, decoupling the ability to perform machine learning from the need to transfer data. In this context, we propose a federated version of the popular fuzzy c-means clustering algorithm. We first describe this version through pseudo-code and then demonstrate that the clusters obtained by the federated approach coincide with those generated by the classical algorithm executed on the union of all the local datasets. We also present an analysis of how privacy is preserved. Finally, we show some experimental results on the performance of the federated version when only some of the clients are involved in the clustering process.
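
The equivalence claim in the abstract can be illustrated with a small sketch. This is not the paper's pseudo-code: it assumes one plausible scheme in which each client sends only per-cluster sums of u^m * x and u^m to a server, which adds them up and divides. The names `client_statistics`, `update_centers`, and `fuzzy_c_means`, the fixed number of rounds, and the synthetic data are all hypothetical. Privacy analysis and partial client participation are not covered here.

```python
import random

def memberships(x, centers, m=2.0):
    """Fuzzy c-means membership degrees of one point x to every center."""
    # Squared Euclidean distances; the small constant avoids division by zero.
    d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) + 1e-12 for c in centers]
    inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
    total = sum(inv)
    return [v / total for v in inv]

def client_statistics(points, centers, m=2.0):
    """Hypothetical client step: return aggregated sums only, never raw points."""
    dim = len(centers[0])
    num = [[0.0] * dim for _ in centers]   # per-cluster sum of u^m * x
    den = [0.0 for _ in centers]           # per-cluster sum of u^m
    for x in points:
        for j, u in enumerate(memberships(x, centers, m)):
            w = u ** m
            den[j] += w
            for k in range(dim):
                num[j][k] += w * x[k]
    return num, den

def update_centers(stats):
    """Hypothetical server step: add up client statistics and divide."""
    c, dim = len(stats[0][1]), len(stats[0][0][0])
    centers = []
    for j in range(c):
        den = sum(s[1][j] for s in stats)
        centers.append([sum(s[0][j][k] for s in stats) / den for k in range(dim)])
    return centers

def fuzzy_c_means(client_data, centers, m=2.0, rounds=20):
    """Run a fixed number of rounds; a single client holding all data is the centralized case."""
    for _ in range(rounds):
        centers = update_centers([client_statistics(p, centers, m) for p in client_data])
    return centers

# Synthetic data (an assumption for illustration): three clients, 50 two-dimensional points each.
random.seed(0)
clients = [[[random.gauss(mu, 1.0), random.gauss(mu, 1.0)] for _ in range(50)]
           for mu in (0.0, 4.0, 8.0)]
init = [[0.0, 0.0], [5.0, 5.0]]

federated = fuzzy_c_means(clients, init)
pooled = [x for pts in clients for x in pts]
centralized = fuzzy_c_means([pooled], init)
```

Under these assumptions, `federated` and `centralized` should agree up to floating-point rounding, because the sums over the pooled dataset decompose exactly into sums over the clients.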
Use this identifier to cite or link to this document: https://hdl.handle.net/11568/1123510