Use of machine learning to identify determinants of habitual preformed water intake
Piaggi, Paolo;
In press
Abstract
Background: Water intake is vital for health, yet the determinants of preformed water consumption in adults are poorly understood.
Objectives: This study applied machine learning (ML) models to identify factors associated with preformed water intake, defined as water ingested from plain water, other beverages, and food.
Methods: This secondary analysis used baseline data from 219 participants in the Comprehensive Assessment of Long-term Effects of Reducing Intake of Energy (CALERIE™) 2 trial, a randomized controlled trial with extensive measures of body composition, energy expenditure, and dietary, physiological, psychological, and biomarker variables in healthy adults without obesity. Habitual intake of preformed water was quantified using deuterium and oxygen-18 isotope data obtained during two consecutive 14-day doubly labeled water measurement periods of weight stability. We developed models using linear regression, tree-based models (random forest, gradient boosting, extreme gradient boosting), and penalized regression models (ridge, lasso, elastic net) to identify factors associated with preformed water intake.
Results: Based on root mean squared error, the ridge regression model using 25 variables performed best and explained 38% of the variance in preformed water intake. Higher preformed water intake was associated with higher intake of dietary fiber, protein, and alcohol, greater total weight of food ingested, and lower intake of carbohydrate and sodium. Higher preformed water intake was also associated with lower percent body fat and with higher fat-free mass and total energy expenditure. Notably, ML models identified alcohol and potassium intake as important predictors that were not selected by traditional linear regression, underscoring the ability of these models to capture nuanced relationships.
Conclusions: These results demonstrate that data-driven ML models using a complex dataset can identify features and patterns associated with intake of an important nutrient that might be missed using traditional statistical approaches, and that such models could be used to identify individuals at risk of inadequate hydration. Clinical trial registration: NCT00427193, clinicaltrials.gov.
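The model comparison described in the Methods can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the sample size (219) and number of predictors (25) are borrowed from the abstract only to set the scale, and the helper names (`ridge_fit`, `ridge_predict`, `cv_rmse`), the candidate penalties, and the 5-fold split are all assumptions made for illustration. The sketch fits closed-form ridge regression with NumPy and selects the penalty with the lowest cross-validated root mean squared error (RMSE), the criterion named in the Results.

```python
import numpy as np


def ridge_fit(X, y, alpha):
    """Closed-form ridge on standardized predictors; the intercept is not penalized."""
    mu, sd = X.mean(axis=0), X.std(axis=0)  # assumes no constant columns (sd > 0)
    Z = (X - mu) / sd
    y_mean = y.mean()
    p = Z.shape[1]
    # Solve (Z'Z + alpha * I) beta = Z'(y - mean(y))
    beta = np.linalg.solve(Z.T @ Z + alpha * np.eye(p), Z.T @ (y - y_mean))
    return mu, sd, y_mean, beta


def ridge_predict(model, X):
    mu, sd, y_mean, beta = model
    return y_mean + ((X - mu) / sd) @ beta


def cv_rmse(X, y, alpha, k=5, seed=0):
    """Pooled k-fold cross-validated RMSE for one penalty value."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    sq_err = 0.0
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        model = ridge_fit(X[train_idx], y[train_idx], alpha)
        resid = y[test_idx] - ridge_predict(model, X[test_idx])
        sq_err += float(np.sum(resid ** 2))
    return float(np.sqrt(sq_err / len(y)))


# Synthetic stand-in data (hypothetical): 219 rows, 25 predictors, 5 with real effects.
rng = np.random.default_rng(42)
n, p = 219, 25
X = rng.normal(size=(n, p))
true_beta = np.zeros(p)
true_beta[:5] = [0.5, 0.4, -0.3, 0.3, -0.2]
y = 3.0 + X @ true_beta + rng.normal(scale=0.8, size=n)

# Compare candidate penalties by cross-validated RMSE and keep the lowest.
alphas = [0.01, 1.0, 10.0, 100.0]
scores = {a: cv_rmse(X, y, a) for a in alphas}
best_alpha = min(scores, key=scores.get)

# Cross-validated share of variance explained at the selected penalty.
r2 = 1.0 - scores[best_alpha] ** 2 / float(np.var(y))
print(f"best alpha = {best_alpha}, CV RMSE = {scores[best_alpha]:.3f}, CV R^2 = {r2:.2f}")
```

Standardizing the predictors before fitting puts the penalty on a common scale across variables, which matters when predictors are measured in different units, as dietary and body composition variables are. The same loop could be extended to other model families by swapping the fit and predict functions while holding the folds fixed.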


