Artificial Neural Networks for Nonlinear Regression and Classification
LANDI, ALBERTO;PIAGGI, PAOLO;LAURINO, MARCO;MENICUCCI D.
2010-01-01
Abstract
Linear regression and classification techniques are very common in statistical data analysis, but they can often extract only linear models from data, which can be a limitation with real-world data. The aim of this study is to build an innovative procedure to overcome this limitation. First, a multiple linear regression analysis using the best-subset algorithm was performed to determine the variables that best predict the dependent variable. Then, using the same selected variables, Artificial Neural Networks were employed to improve on the prediction of the linear model, taking advantage of their nonlinear modeling capability. The linear and nonlinear models were compared on classification (ROC curves) and prediction (cross-validation) tasks: the nonlinear model fit the data better (36% vs. 10% of variance explained for the nonlinear and linear models, respectively) and gave more favorable accuracy and misclassification rates (70% and 30% vs. 66% and 34%, respectively).
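The variable-selection step can be illustrated with a minimal sketch of best-subset selection for multiple linear regression. It rests on several assumptions that the abstract does not confirm: candidate subsets are ranked by adjusted R² (the study's actual criterion is not stated), the data are a synthetic toy design, and the function names `solve`, `adjusted_r2` and `best_subset` are hypothetical. The neural-network stage is not shown.

```python
from itertools import combinations


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def adjusted_r2(X, y, cols):
    """Fit OLS (with intercept) on the chosen columns; return adjusted R^2."""
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    n, p = len(y), len(cols) + 1
    # Normal equations: (X'X) beta = X'y
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    pred = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - (ss_res / (n - p)) / (ss_tot / (n - 1))


def best_subset(X, y):
    """Exhaustively score every non-empty subset of predictors."""
    k = len(X[0])
    candidates = [cols for size in range(1, k + 1)
                  for cols in combinations(range(k), size)]
    return max(candidates, key=lambda cols: adjusted_r2(X, y, cols))


def sign(i, j):
    """+1/-1 entry of a balanced design (mutually orthogonal columns)."""
    return -1.0 if bin(i & j).count("1") % 2 else 1.0


# Toy data (assumption): three candidate predictors, of which only
# columns 0 and 2 drive y; the noise term is orthogonal to all of them.
X = [[sign(i, 1), sign(i, 2), sign(i, 3)] for i in range(8)]
noise = [0.5 * sign(i, 4) for i in range(8)]
y = [1.0 + 2.0 * r[0] - 3.0 * r[2] + e for r, e in zip(X, noise)]

print(best_subset(X, y))  # → (0, 2)
```

Exhaustive search scores 2^k - 1 subsets, so it is practical only for a modest number of candidate variables. In the procedure described above, the selected columns would then serve as the inputs to the nonlinear model.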