A study on the application of instance selection techniques in genetic fuzzy rule-based classification systems: Accuracy-complexity trade-off

Fazzolari, Michela; Giglio, Bruno; Alcalá, Rafael; Marcelloni, Francesco; Herrera, Francisco

doi:10.1016/j.knosys.2013.07.011

In the framework of genetic fuzzy systems, the computational time required by genetic algorithms to generate fuzzy rule-based models from data increases considerably with the number of instances in the training set, mainly because of the fitness evaluation. The amount of data also typically affects the complexity of the resulting model: a higher number of instances generally leads to models with a higher number of rules. Since the number of rules is considered one of the factors that affect the interpretability of fuzzy rule-based models, large datasets generally lead to less interpretable models. Both problems can be tackled and partially solved by reducing the number of instances before applying the evolutionary process. Several instance selection algorithms have been proposed in the literature for selecting instances without deteriorating the accuracy of the generated models. The aim of this paper is to analyze the effectiveness of 36 training set selection methods when combined with genetic fuzzy rule-based classification systems. Using 37 datasets of different sizes, we show that some of these methods can considerably reduce the computational time of the evolutionary process and decrease the complexity of the fuzzy rule-based models, with only a very limited decrease in accuracy with respect to the models generated from the entire training set.
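
To make the idea of reducing the training set before learning concrete, the following is a minimal sketch of one classic instance selection rule, Hart's condensed nearest neighbor. The abstract does not name the 36 methods studied, so the choice of this rule is an assumption made only for illustration; the function name `condensed_nearest_neighbor`, the toy data, and the `reduction` measure are likewise hypothetical and not taken from the paper.

```python
from math import dist  # Euclidean distance, Python 3.8+


def condensed_nearest_neighbor(X, y):
    """Return sorted indices of a subset S such that a 1-NN classifier
    built on S labels every instance of (X, y) correctly.

    Illustrative sketch only: not the paper's method or implementation.
    """
    keep = [0]  # seed the subset with the first instance
    changed = True
    while changed:  # repeat passes until no instance is added
        changed = False
        for i in range(len(X)):
            if i in keep:
                continue
            # find the nearest instance among those already retained
            nearest = min(keep, key=lambda j: dist(X[i], X[j]))
            if y[nearest] != y[i]:
                # misclassified by the current subset, so retain it
                keep.append(i)
                changed = True
    return sorted(keep)


# Toy data (hypothetical): two well-separated clusters, one per class.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (1.1, 1.0), (1.0, 1.1)]
y = [0, 0, 0, 1, 1, 1]

selected = condensed_nearest_neighbor(X, y)
reduction = 1 - len(selected) / len(X)  # fraction of instances removed
```

In a pipeline of the kind the abstract describes, only the instances indexed by `selected` would be passed to the genetic learning step, so each fitness evaluation would process fewer instances.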