Trading-Off Machine Learning Algorithms towards Data-Driven Administrative-Socio-Economic Population Health Management

Panicacci, Silvia; Donati, Massimiliano; Fanucci, Luca
2021-01-01

Abstract

As the population ages, the number of people suffering from multimorbidity is increasing and is expected to exceed half of the population by 2035. This group comprises the highest-risk patients, who are also the heaviest users of healthcare systems. Identifying this sub-population early can improve people's quality of life and reduce healthcare costs. In this paper, we describe a population health management tool that uses state-of-the-art intelligent algorithms on administrative and socio-economic data to identify high-risk patients early. The study covers the population of the Local Health Unit of Central Tuscany in 2015, which amounts to 1,670,129 residents. After weighing the trade-offs among machine learning models and input data, we find that Random Forest applied to one year of historical data achieves the best results, outperforming state-of-the-art models. The most important variables for this model, in terms of mean minimal depth, accuracy decrease and Gini decrease, are age and certain groups of drugs, such as high-ceiling diuretics. Thanks to its low inference time and reduced memory usage, the resulting model allows real-time updates of risk predictions whenever new data become available, giving General Practitioners the opportunity to adopt personalised medicine early.
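One of the importance measures named in the abstract, Gini decrease, can be illustrated with a minimal sketch. The code below is not the authors' implementation: the function names, the toy ages and the high-risk labels are all hypothetical and chosen only to show how the impurity reduction of a single candidate split on one variable is computed. In a Random Forest, such decreases are accumulated over all the splits that use a variable and averaged over the trees.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(values, labels, threshold):
    """Impurity reduction obtained by splitting on `values <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    # Child impurities are weighted by the share of samples in each child.
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted_children


# Hypothetical toy data (not from the study): age and a made-up high-risk flag.
ages = [45, 50, 62, 70, 78, 85]
high_risk = [0, 0, 0, 1, 1, 1]

for threshold in (47.5, 65):
    decrease = gini_decrease(ages, high_risk, threshold)
    print(f"split at age <= {threshold}: Gini decrease = {decrease:.3f}")
```

In this toy data the split at 65 separates the two classes perfectly, so it removes all of the parent node's impurity, whereas the split at 47.5 isolates only one sample and yields a much smaller decrease.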
2021
Panicacci, Silvia; Donati, Massimiliano; Profili, Francesco; Francesconi, Paolo; Fanucci, Luca

Use this identifier to cite or link to this document: https://hdl.handle.net/11568/1067535