Development of a machine learning tool for the enhancement of carbon footprint prediction for cattle milk production
Mantino, Alberto (Data Curation); Finocchi, Matteo (Data Curation); Mele, Marcello (Data Curation)
2025-01-01
Abstract
Purpose: Combining machine learning (ML) with life cycle-based methodologies is a promising strategy for overcoming some of the most significant obstacles in carbon footprint (CF) calculation and prediction. This research presents an approach and a software tool that apply ML techniques to address limited data availability in the life cycle inventory (LCI) and to improve the accuracy of CF predictions. The ultimate aim is to streamline CF calculation in the dairy cattle farming sector.

Methods: The methodology consists of three steps: dataset creation, data optimization, and enhanced prediction. First, a dataset was compiled from primary sources and from the literature on life cycle assessment (LCA) studies of dairy cattle farming. This dataset contained missing values that could affect prediction accuracy. ML techniques were then applied to improve data quality. An ML tool implementing 11 different regression algorithms was developed. The tool, designed to be usable by non-ML experts, automatically optimizes data quality by determining, case by case, the most appropriate algorithm for predicting each set of missing values. In the final step, the same tool was applied to the optimized dataset to identify and use the best model for predicting CO2-eq emissions.

Results and discussion: An initial test evaluated the accuracy of the ML models before any data optimization. The Gaussian kernel regression model performed best among the available models, with a root mean square error (RMSE) of 18.87%. This value serves as the threshold against which the efficacy of the described approach is assessed. The tool was then used to optimize the dataset by predicting the missing values. Finally, the tool was used to predict CO2-eq emissions once more, this time trained on the optimized dataset. As in the previous test, the Gaussian model performed best, now with an RMSE of 14.65%. The RMSE was therefore about 4 percentage points lower (18.87% versus 14.65%, a relative reduction of roughly 22%), reflecting increased prediction accuracy.

Conclusions: The results demonstrate that the presented ML-based approach and tool are effective in predicting missing inventory data and in improving the accuracy and reliability of CO2-eq emissions prediction in life cycle-based studies. The tool also makes ML more accessible to non-expert users. This integration of ML and life cycle-based methodologies opens promising avenues for more accurate and efficient environmental impact assessments in agriculture.
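The data-optimization step described in the Methods (choosing, case by case, the regression algorithm that best predicts a missing inventory value) can be illustrated with a minimal sketch. This is not the authors' tool. The abstract does not list the 11 algorithms, so three simple stand-ins (a mean predictor, a one-variable linear regression, and a k-nearest-neighbour regressor) are assumed here. The column names `herd_size` and `feed_kg`, the sample values, and the use of leave-one-out cross-validation as the selection criterion are likewise hypothetical.

```python
import math

def _mean(values):
    return sum(values) / len(values)

def rmse(y_true, y_pred):
    """Root mean square error between two equally long lists."""
    return math.sqrt(_mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)]))

# Each candidate takes training lists (xs, ys) and returns a predictor function.
def fit_mean(xs, ys):
    m = _mean(ys)
    return lambda x: m

def fit_linear(xs, ys):
    mx, my = _mean(xs), _mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)

def fit_knn(xs, ys, k=2):
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return _mean([y for _, y in nearest])
    return predict

# Hypothetical stand-ins for the tool's 11 regression algorithms.
CANDIDATES = {"mean": fit_mean, "linear": fit_linear, "knn": fit_knn}

def loo_rmse(fit, xs, ys):
    """Leave-one-out cross-validated RMSE of one candidate."""
    preds = []
    for i in range(len(xs)):
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(model(xs[i]))
    return rmse(ys, preds)

def select_best(xs, ys):
    """Return the name of the lowest-error candidate and all scores."""
    scores = {name: loo_rmse(fit, xs, ys) for name, fit in CANDIDATES.items()}
    return min(scores, key=scores.get), scores

def impute(rows, feature, target):
    """Fill missing `target` values (None) using the best candidate."""
    known = [r for r in rows if r[target] is not None]
    xs = [r[feature] for r in known]
    ys = [r[target] for r in known]
    best, _ = select_best(xs, ys)
    model = CANDIDATES[best](xs, ys)
    filled = [
        dict(r, **{target: model(r[feature])}) if r[target] is None else dict(r)
        for r in rows
    ]
    return filled, best

# Hypothetical inventory records; one feed value is missing.
records = [
    {"herd_size": 10, "feed_kg": 25.0},
    {"herd_size": 20, "feed_kg": 45.0},
    {"herd_size": 30, "feed_kg": 65.0},
    {"herd_size": 40, "feed_kg": None},
    {"herd_size": 50, "feed_kg": 105.0},
    {"herd_size": 60, "feed_kg": 125.0},
]
filled, best = impute(records, "herd_size", "feed_kg")
print(best, filled[3]["feed_kg"])
```

The same selection routine could then be run a second time on the completed records, with CO2-eq emissions as the target, which mirrors the "enhanced prediction" step. A real implementation would use several input features and a proper model library.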


