Optimizing Partition Granularity, Membership Function Parameters, and Rule Bases of Fuzzy Classifiers for Big Data by a Multi-objective Evolutionary Approach

Barsacchi, Marco; Bechini, Alessio; Ducange, Pietro; Marcelloni, Francesco
2019-01-01

Abstract

Classical data mining algorithms are considered inadequate to manage the volume, variety, velocity, and veracity aspects of big data. The advent of a number of open-source cluster-computing frameworks has opened new interesting perspectives for handling the volume and velocity features. In this context, thanks to their capability of coping with vague and imprecise information, distributed fuzzy models appear to be particularly suitable for handling the variety and veracity features of big data. Moreover, the interpretability of fuzzy models may assume a particular relevance in the context of big data mining. In this work, we propose a novel approach for generating, out of big data, a set of fuzzy rule–based classifiers characterized by different optimal trade-offs between accuracy and interpretability. We extend a state-of-the-art distributed multi-objective evolutionary learning scheme, implemented under the Apache Spark environment. In particular, we exploit a recently proposed distributed fuzzy decision tree learning approach for generating an initial rule base that serves as input to the evolutionary process. Furthermore, we integrate the evolutionary learning scheme with an ad hoc strategy for the granularity learning of the fuzzy partitions, along with the optimization of both the rule base and the fuzzy set parameters. Experimental investigations show that the proposed approach is able to generate fuzzy rule–based classifiers that are significantly less complex than the ones generated by the original multi-objective evolutionary learning scheme, while keeping the same accuracy levels.
Files in this record:

Cognitive2018_POSTPRINT.pdf
Access: open access
Description: post-print version
Type: post-print document
License: All rights reserved
Size: 806.15 kB
Format: Adobe PDF

Barsacchi2019_Article_OptimizingPartitionGranularity.pdf
Access: authorized users only
Description: official version from the journal website
Type: final published version
License: not public (private/restricted access)
Size: 1.59 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11568/943607
Citations
  • PMC: n/a
  • Scopus: 16
  • ISI: 13