Automated population of structured magnetic resonance imaging staging reports for rectal cancer from free-text radiology reports using deep learning–based natural language processing
Fanni, Salvatore Claudio; Lossano, Simone; Uggenti, Vincenzo; Lizzi, Francesca; Febi, Maria; Aringhieri, Giacomo; Faggioni, Lorenzo; Lencioni, Riccardo; Neri, Emanuele; Cioni, Dania
2026-01-01
Abstract
Background/objective: Structured reporting improves the completeness and consistency of radiology documentation and may reduce error rates. This study evaluated a deep learning–based natural language processing (NLP) framework for the automatic conversion of free-text rectal MRI reports, acquired for local staging, into standardized structured reports.

Methods: Two expert radiologists authored 151 synthetic free-text rectal MRI reports and completed corresponding structured versions following the template of the Italian Society of Medical and Interventional Radiology (SIRM). These paired reports were used to fine-tune a large-scale pretrained text-to-text model (mT5) to extract 200 categorical variables corresponding to the fields of the structured report. Performance was assessed using accuracy, balanced accuracy, and F1-score. To mitigate class imbalance, a second training session was carried out after performing a targeted permutation of underrepresented variables.

Results: In the first training session, the model achieved an accuracy (mean ± SD) of 0.96 ± 0.08, a balanced accuracy of 0.93 ± 0.16, and an F1-score of 0.96 ± 0.09. In the second training session, after permutation, performance was an accuracy of 0.94 ± 0.09, a balanced accuracy of 0.85 ± 0.23, and an F1-score of 0.93 ± 0.10.

Conclusions: The mT5 model achieved high performance in the automatic structuring of rectal cancer staging MRI reports, without significant degradation after targeted permutation of variables in the second training session.
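As an illustration of how the three reported metrics relate to one another, the following is a minimal sketch, not the authors' evaluation code. It assumes each categorical variable is scored separately and the per-variable scores are then summarized as mean ± SD. The abstract does not state the F1 averaging scheme or the SD convention, so the support-weighted F1 and the sample standard deviation used here are assumptions, and the variable names and labels in the toy data are hypothetical.

```python
from collections import Counter
from statistics import mean, stdev


def accuracy(y_true, y_pred):
    """Fraction of reports where the predicted category matches the reference."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall, so rare categories count as much as common ones."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return mean(recalls)


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with class support as weights (assumed scheme)."""
    support = Counter(y_true)
    total = 0.0
    for c, n in support.items():
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * n
    return total / len(y_true)


def summarize(per_variable):
    """Return {metric: (mean, sample SD)} across variables.

    per_variable maps a variable name to a (y_true, y_pred) pair of label lists.
    """
    metrics = {
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "weighted_f1": weighted_f1,
    }
    summary = {}
    for name, fn in metrics.items():
        scores = [fn(y_true, y_pred) for y_true, y_pred in per_variable.values()]
        sd = stdev(scores) if len(scores) > 1 else 0.0
        summary[name] = (mean(scores), sd)
    return summary


# Hypothetical toy data: two variables, four and two reports respectively.
toy = {
    "mrf_involvement": (["yes", "no", "no", "no"], ["no", "no", "no", "no"]),
    "t_stage": (["T2", "T3"], ["T2", "T3"]),
}
summary = summarize(toy)
```

In the toy `mrf_involvement` variable the model always predicts the majority class: plain accuracy stays fairly high while balanced accuracy falls to chance level. This is one way a gap between accuracy and balanced accuracy can arise under class imbalance.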


