A comparison between different prediction models for invasive breast cancer occurrence in the French E3N cohort

BAGLIETTO, LAURA;
2015-01-01

Abstract

Breast cancer remains a global health concern, and highly discriminating prediction models are lacking. The k-nearest-neighbor (kNN) algorithm offers an intuitive way to estimate individual risk. This study compares the performance of this approach with that of the Cox and Gail models for 5-year breast cancer risk prediction. The study included 64,995 women from the French E3N prospective cohort. The sample was divided into a learning series (N = 51,821), used to fit the models with fivefold cross-validation, and a validation series (N = 13,174), used to evaluate them. The area under the receiver operating characteristic curve (AUC) and the ratio of expected to observed numbers of cases (E/O) were estimated. In the learning and validation series, respectively, 393 and 78 premenopausal and 537 and 98 postmenopausal breast cancers were diagnosed. The discrimination values of the best combinations of predictors obtained from cross-validation ranged from 0.59 to 0.60. In the validation series, the AUC values in premenopausal and postmenopausal women were 0.583 [0.520; 0.646] and 0.621 [0.563; 0.679] with the kNN and 0.565 [0.500; 0.631] and 0.617 [0.561; 0.673] with the Cox model. The corresponding E/O ratios were 1.26 and 1.28 in premenopausal women and 1.44 and 1.40 in postmenopausal women. The Gail model, as applied, yielded AUC values of 0.614 [0.554; 0.675] and 0.549 [0.495; 0.604] and E/O ratios of 0.78 and 1.12. This study shows that prediction performance differed according to menopausal status when parametric statistical tools were used. The k-nearest-neighbor approach performed well, and it improved discrimination in postmenopausal women compared with the Gail model.
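The two performance measures named in the abstract can be illustrated with a minimal sketch. It assumes a simplified setting that the abstract does not describe: each woman has a predicted 5-year risk and a binary outcome (1 = diagnosed, 0 = not), with no censoring. The function names `auc` and `eo_ratio` and the toy data are hypothetical and are not taken from the study.

```python
def auc(risks, events):
    """Rank-based AUC: the probability that a randomly chosen case has a
    higher predicted risk than a randomly chosen non-case (ties count 1/2).
    Assumption: binary outcomes without censoring."""
    cases = [r for r, e in zip(risks, events) if e == 1]
    non_cases = [r for r, e in zip(risks, events) if e == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))


def eo_ratio(risks, events):
    """Expected over observed number of cases: the sum of predicted risks
    divided by the number of observed cases. Values above 1 indicate
    overestimation, values below 1 underestimation."""
    observed = sum(events)
    if observed == 0:
        raise ValueError("no observed cases")
    return sum(risks) / observed


# Hypothetical toy data, for illustration only.
toy_risks = [0.01, 0.02, 0.03, 0.04, 0.05, 0.10]
toy_events = [0, 0, 1, 0, 0, 1]

if __name__ == "__main__":
    print("AUC:", auc(toy_risks, toy_events))
    print("E/O:", eo_ratio(toy_risks, toy_events))
```

Read this way, an AUC near 0.5 means the model barely separates cases from non-cases, while an E/O ratio of 1.26 means the model expected about 26% more cases than were observed.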
2015
Dartois, Laureen; Gauthier, Émilien; Heitzmann, Julia; Baglietto, Laura; Michiels, Stefan; Mesrine, Sylvie; Boutron-Ruault, Marie-Christine; Delaloge, Suzette; Ragusa, Stéphane; Clavel-Chapelon, Françoise; Fagherazzi, Guy
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11568/818022