Data-Driven Road Traffic Safety Modeling: A Comprehensive Literature Review
Wang C.; Fiorentini N.; Riccardi C.; Losa M.
2026-01-01
Abstract
This review examines data-driven road traffic safety modeling, aiming to provide a comprehensive overview of the state of the art and persistent research gaps. The study is structured around data sources, influencing factors, reactive and proactive modeling approaches, and key challenges. Data sources, including crash, trajectory, traffic, roadway geometry, and environmental data, are first reviewed in the context of reactive and proactive safety analysis. To address the substantial heterogeneity across studies, a vote-counting strategy is adopted to aggregate directional evidence reported in the literature. The synthesis indicates that traffic demand variables exhibit consistently positive associations with crash occurrence, while speed-related effects are strongly context-dependent. Road geometry and surface conditions have largely consistent directional impacts on safety outcomes. From a methodological perspective, reactive approaches remain dominant, while proactive approaches exhibit potential for early risk identification but remain insufficiently validated due to data quality constraints. In addition, empirical evidence on conflict–crash relationships is still limited. Notably, model performance varies substantially across safety tasks, with algorithm effectiveness primarily driven by data structure, outcome definition, and aggregation level, rather than by the intrinsic superiority of any single approach. Overall, this review highlights challenges related to data integration, spatio-temporal modeling, interpretability, and transferability, and provides practical guidance for model selection in operational road safety analysis.


