Improving hydrothermal carbonization prediction by machine learning: towards a more accurate and less expensive process modelling through data augmentation

Bartolomeo Cosenza
2025-01-01

Abstract

Machine-learning models of hydrothermal carbonization (HTC) need large, high-quality datasets to train the predictive algorithms adequately. For energy-intensive processes such as HTC, data collection can be expensive and time-consuming, so predictive models are usually trained on small datasets collected from the literature. To overcome this limitation, we added controlled Gaussian noise to experimentally obtained training datasets, significantly expanding their size without distorting their fundamental properties. Unlike the current HTC modelling approach and other data augmentation techniques, this novel method yields large datasets that are homogeneous in terms of consistency and reliability, leading to more robust predictions while maintaining a realistic process representation. After data augmentation, the developed models, based on an artificial neural network (ANN) and a support vector machine (SVM), performed under more realistic conditions and benefited greatly from training with the enlarged datasets. The ANN exhibited superior predictive capabilities compared with the SVM, with a reduction in Mean Square Error greater than 90 %. Mean Absolute Percentage Errors were below 5 % for the ANN and in the range 6–15 % for the SVM. The proposed approach will help enhance the algorithms’ predictive power while requiring limited experimental data, thereby reducing research time and costs.
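
The augmentation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the relative noise level `rel_sigma`, the number of noisy copies, and the example rows are all assumptions chosen for illustration. Each experimental row is replicated with small zero-mean Gaussian perturbations, and two helper functions compute the error metrics named in the abstract (Mean Square Error and Mean Absolute Percentage Error).

```python
import random
from typing import List, Sequence, Tuple

Row = Tuple[float, ...]


def augment_with_gaussian_noise(
    rows: Sequence[Sequence[float]],
    copies: int = 10,
    rel_sigma: float = 0.01,
    seed: int = 0,
) -> List[Row]:
    """Return the original rows followed by `copies` noisy replicas of each row.

    Each value x is redrawn from a normal distribution with mean x and
    standard deviation rel_sigma * |x|. Both `copies` and `rel_sigma` are
    illustrative assumptions, not values taken from the paper.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    augmented: List[Row] = [tuple(float(v) for v in row) for row in rows]
    for row in rows:
        for _ in range(copies):
            augmented.append(tuple(rng.gauss(v, rel_sigma * abs(v)) for v in row))
    return augmented


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_percentage_error(
    y_true: Sequence[float], y_pred: Sequence[float]
) -> float:
    """Average absolute relative error, in percent (targets must be non-zero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Made-up placeholder rows: (temperature, residence time, measured output).
    experimental = [(180.0, 1.0, 65.0), (220.0, 3.0, 55.0)]
    enlarged = augment_with_gaussian_noise(experimental, copies=5)
    print(len(enlarged))  # 2 originals + 2 * 5 noisy copies = 12 rows
```

In this sketch the noise scales with the magnitude of each value, so variables with different units are perturbed proportionally; a fixed absolute standard deviation per variable would be an equally plausible choice. The enlarged dataset would then be used to train a regressor such as an ANN or SVM, with the two metrics evaluated on held-out experimental data.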
2025
Cosenza, Bartolomeo; Picone, Antonio; Volpe, Maurizio; Messineo, Antonio
Use this identifier to cite or link to this document: https://hdl.handle.net/11568/1319788