Consensus Clustering of temporal profiles for the identification of metabolic markers of pre-diabetes in childhood (EarlyBird 73)

Priami, Corrado
2018-01-01

Abstract

In longitudinal clinical studies, methodologies available for the analysis of multivariate data with multivariate methods are relatively limited. Here, we present Consensus Clustering (CClust), a new computational method based on clustering of time profiles and subsequent identification of correlation between clusters and predictors. Subjects are first clustered into groups according to the temporal profile of a response variable, using a robust consensus-based strategy. To discover which of the remaining variables are associated with the resulting groups, a non-parametric hypothesis test is performed between groups at every time point, and the results are then aggregated according to the Fisher method. Our approach is tested through its application to the EarlyBird cohort database, which contains temporal variations of clinical, metabolic, and anthropometric profiles in a population of 150 children followed up annually from age 5 to age 16. Our results show that our consensus-based method is able to overcome the problem of the approach-dependent results produced by current clustering algorithms, producing groups defined according to Insulin Resistance (IR) and biological age (Tanner Score). Moreover, it provides meaningful biological results confirmed by hypothesis testing with most of the main clinical variables. These results position CClust as a valid alternative for the analysis of multivariate longitudinal data.
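The aggregation step mentioned in the abstract, combining the per-time-point test results with the Fisher method, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `fisher_combine` and the example p-values are hypothetical, and the abstract does not say which non-parametric test produces the per-time-point p-values, so they are simply taken as input here. Fisher's method assumes the combined tests are independent, which repeated measurements on the same subjects may only approximate.

```python
import math


def fisher_combine(p_values):
    """Combine p-values with Fisher's method (hypothetical helper).

    The statistic -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null hypothesis, assuming the k tests
    are independent. Returns (statistic, combined_p).
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)

    # For an even number of degrees of freedom (2k), the chi-square
    # survival function has a closed form:
    #   exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    half = statistic / 2.0
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half / j
        total += term
    return statistic, min(1.0, math.exp(-half) * total)


# Hypothetical p-values for one variable, one per annual visit (ages 5 to 16).
example_p = [0.40, 0.22, 0.09, 0.05, 0.03, 0.04, 0.01, 0.02, 0.06, 0.11, 0.30, 0.45]
stat, combined = fisher_combine(example_p)
print(f"Fisher statistic = {stat:.2f}, combined p-value = {combined:.4g}")
```

With a single p-value the combined value equals the input, which is a quick sanity check on the closed-form tail probability.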
2018
Lauria, Mario; Persico, Maria; Dordevic, Nikola; Cominetti, Ornella; Matone, Alice; Hosking, Joanne; Jeffery, Alison; Pinkney, Jonathan; Da Silva, Laeticia; Priami, Corrado; Montoliu, Ivan; Martin, François-Pierre
Files in this record:

Lauria_et_al-2018-Scientific_Reports.pdf
Access: open access
Description: Main article
Type: Publisher's final version
License: Creative Commons
Size: 5.95 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11568/894462