CINECA IRIS Institutional Research Information System

Classical data mining algorithms are considered inadequate to manage the volume, variety, velocity, and veracity aspects of big data. The advent of a number of open-source cluster-computing frameworks has opened new interesting perspectives for handling the volume and velocity features. In this context, thanks to their capability of coping with vague and imprecise information, distributed fuzzy models appear to be particularly suitable for handling the variety and veracity features of big data. Moreover, the interpretability of fuzzy models may assume a particular relevance in the context of big data mining. In this work, we propose a novel approach for generating, out of big data, a set of fuzzy rule–based classifiers characterized by different optimal trade-offs between accuracy and interpretability. We extend a state-of-the-art distributed multi-objective evolutionary learning scheme, implemented under the Apache Spark environment. In particular, we exploit a recently proposed distributed fuzzy decision tree learning approach for generating an initial rule base that serves as input to the evolutionary process. Furthermore, we integrate the evolutionary learning scheme with an ad hoc strategy for the granularity learning of the fuzzy partitions, along with the optimization of both the rule base and the fuzzy set parameters. Experimental investigations show that the proposed approach is able to generate fuzzy rule–based classifiers that are significantly less complex than the ones generated by the original multi-objective evolutionary learning scheme, while keeping the same accuracy levels.

Optimizing Partition Granularity, Membership Function Parameters, and Rule Bases of Fuzzy Classifiers for Big Data by a Multi-objective Evolutionary Approach

Barsacchi, Marco;Bechini, Alessio;Ducange, Pietro;Marcelloni, Francesco

2019-01-01

Abstract

Classical data mining algorithms are considered inadequate to manage the volume, variety, velocity, and veracity aspects of big data. The advent of a number of open-source cluster-computing frameworks has opened new interesting perspectives for handling the volume and velocity features. In this context, thanks to their capability of coping with vague and imprecise information, distributed fuzzy models appear to be particularly suitable for handling the variety and veracity features of big data. Moreover, the interpretability of fuzzy models may assume a particular relevance in the context of big data mining. In this work, we propose a novel approach for generating, out of big data, a set of fuzzy rule–based classifiers characterized by different optimal trade-offs between accuracy and interpretability. We extend a state-of-the-art distributed multi-objective evolutionary learning scheme, implemented under the Apache Spark environment. In particular, we exploit a recently proposed distributed fuzzy decision tree learning approach for generating an initial rule base that serves as input to the evolutionary process. Furthermore, we integrate the evolutionary learning scheme with an ad hoc strategy for the granularity learning of the fuzzy partitions, along with the optimization of both the rule base and the fuzzy set parameters. Experimental investigations show that the proposed approach is able to generate fuzzy rule–based classifiers that are significantly less complex than the ones generated by the original multi-objective evolutionary learning scheme, while keeping the same accuracy levels.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno
	
				2019
			
	Codice DOI
	
				https://dx.doi.org/10.1007/s12559-018-9613-6
			
	Tutti gli autori
	
						Barsacchi, Marco; Bechini, Alessio; Ducange, Pietro; Marcelloni, Francesco
					
	Appare nelle tipologie:
	
				1.1 Articolo in rivista

File in questo prodotto:

File	Dimensione	Formato
Cognitive2018_POSTPRINT.pdf accesso aperto Descrizione: post-print version Tipologia: Documento in Post-print Licenza: Tutti i diritti riservati (All rights reserved) Dimensione 806.15 kB Formato Adobe PDF Visualizza/Apri	806.15 kB	Adobe PDF	Visualizza/Apri
Barsacchi2019_Article_OptimizingPartitionGranularity.pdf solo utenti autorizzati Descrizione: official version from the journal website Tipologia: Versione finale editoriale Licenza: NON PUBBLICO - Accesso privato/ristretto Dimensione 1.59 MB Formato Adobe PDF Visualizza/Apri Richiedi una copia	1.59 MB	Adobe PDF	Visualizza/Apri Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11568/943607

Citazioni

ND

18

15

social impact