Traditional clustering algorithms require data to be centralized on a single machine or in a datacenter. Due to privacy issues and traffic limitations, in several real applications data cannot be transferred, thus hampering the effectiveness of traditional clustering algorithms, which can operate only on locally stored data. In the last years a new paradigm has been gaining popularity: Federated Learning (FL). FL enables the collaborative training of data mining models and, at the same time, preserves data locally at the data owners’ places, decoupling the ability to perform machine learning from the need to transfer data. In this context, we propose the federated version of the popular fuzzy -means clustering algorithm. We first describe this version through pseudo-code and then demonstrate that the clusters obtained by the federated approach coincide with those generated by the classical algorithm executed on the union of all the local datasets. We also present an analysis on how privacy is preserved. Finally, we show some experimental results on the performance of the federated version when only a number of clients are involved in the clustering process.

A federated fuzzy c-means clustering algorithm

José Luis Corcuera Bárcena
Membro del Collaboration Group
;
Francesco Marcelloni;Alessandro Renda;Alessio Bechini;Pietro Ducange
2021-01-01

Abstract

Traditional clustering algorithms require data to be centralized on a single machine or in a datacenter. Due to privacy issues and traffic limitations, in several real applications data cannot be transferred, thus hampering the effectiveness of traditional clustering algorithms, which can operate only on locally stored data. In the last years a new paradigm has been gaining popularity: Federated Learning (FL). FL enables the collaborative training of data mining models and, at the same time, preserves data locally at the data owners’ places, decoupling the ability to perform machine learning from the need to transfer data. In this context, we propose the federated version of the popular fuzzy -means clustering algorithm. We first describe this version through pseudo-code and then demonstrate that the clusters obtained by the federated approach coincide with those generated by the classical algorithm executed on the union of all the local datasets. We also present an analysis on how privacy is preserved. Finally, we show some experimental results on the performance of the federated version when only a number of clients are involved in the clustering process.
File in questo prodotto:
File Dimensione Formato  
paper08.pdf

accesso aperto

Tipologia: Versione finale editoriale
Licenza: Creative commons
Dimensione 826.01 kB
Formato Adobe PDF
826.01 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11568/1123510
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 1
  • ???jsp.display-item.citation.isi??? ND
social impact