A Fuzzy Density-based Clustering Algorithm for Streaming Data
Bechini A.; Marcelloni F.; Renda A.
2019-01-01
Abstract
The exploitation of data streams, now produced continuously by a myriad of diverse applications, calls for dedicated analysis methods. In this paper, we propose SF-DBSCAN, a fuzzy version of the DBSCAN algorithm designed for the unsupervised analysis of streaming data. Fuzziness is introduced through fuzzy borders of density-based clusters. We describe and discuss the proposed algorithm, which updates the clusters each time a new object arrives. Three synthetic datasets are used to show that SF-DBSCAN can track changes in the data distribution, thus properly addressing concept drift. SF-DBSCAN is compared with a basic, crisp streaming version of DBSCAN with regard to modelling effectiveness.
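
The abstract does not give the membership function or the update rule, so the following is only an illustrative sketch of the general idea, not the SF-DBSCAN algorithm itself. It assumes a fuzzy border defined by two radii (hypothetical parameters `eps_min` and `eps_max`): a non-core object within `eps_min` of a core object belongs fully to that cluster, and its membership decays linearly to zero at `eps_max`. The class name `NaiveStreamingFuzzyDBSCAN` and the parameter `min_pts` are likewise assumptions made for illustration. For simplicity the sketch re-clusters all stored objects on every arrival rather than updating clusters incrementally.

```python
import math


def border_membership(distance, eps_min, eps_max):
    """Assumed linear fuzzy membership: 1 inside eps_min, 0 beyond eps_max."""
    if distance <= eps_min:
        return 1.0
    if distance >= eps_max:
        return 0.0
    return (eps_max - distance) / (eps_max - eps_min)


class NaiveStreamingFuzzyDBSCAN:
    """Illustrative only: re-clusters from scratch at each new object."""

    def __init__(self, eps_min, eps_max, min_pts):
        self.eps_min = eps_min
        self.eps_max = eps_max
        self.min_pts = min_pts
        self.points = []

    def _neighbours(self, i, radius):
        # Indices of all stored points within `radius` of point i (i included).
        p = self.points[i]
        return [j for j, q in enumerate(self.points) if math.dist(p, q) <= radius]

    def insert(self, point):
        """Store a new object and return the fuzzy memberships of all objects."""
        self.points.append(tuple(point))
        return self._recluster()

    def _recluster(self):
        n = len(self.points)
        # Core objects: at least min_pts neighbours within the inner radius.
        core = [i for i in range(n)
                if len(self._neighbours(i, self.eps_min)) >= self.min_pts]
        core_set = set(core)

        # Group density-connected core objects into crisp cluster labels.
        label = {}
        next_id = 0
        for i in core:
            if i in label:
                continue
            label[i] = next_id
            stack = [i]
            while stack:
                u = stack.pop()
                for v in self._neighbours(u, self.eps_min):
                    if v in core_set and v not in label:
                        label[v] = next_id
                        stack.append(v)
            next_id += 1

        # Core objects get membership 1; the rest get fuzzy border memberships.
        memberships = []
        for i in range(n):
            if i in core_set:
                memberships.append({label[i]: 1.0})
                continue
            m = {}
            for c in core:
                d = math.dist(self.points[i], self.points[c])
                mu = border_membership(d, self.eps_min, self.eps_max)
                if mu > 0.0:
                    m[label[c]] = max(m.get(label[c], 0.0), mu)
            memberships.append(m)  # an empty dict marks the object as noise
        return memberships
```

Calling `insert` once per arriving object returns, for every stored object, a dictionary mapping cluster labels to membership degrees. A crisp streaming baseline would correspond to setting `eps_min` equal to `eps_max` in a variant that guards against the resulting zero-width border.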