Current train delay (TD) prediction systems do not take advantage of state-of-the-art tools and techniques for handling and extracting useful and actionable information from the large amount of available endogenous (i.e., generated by the railway system itself) and exogenous (i.e., related to railway operation but generated by external phenomena) data. Additionally, they are not designed to deal with the intrinsically time-varying nature of the problem (e.g., regular changes in the nominal timetable). The purpose of this paper is to build a dynamic data-driven TD prediction system that exploits the most recent tools and techniques in the field of time-varying big data analysis. In particular, we map the TD prediction problem into a time-varying multivariate regression problem, which allows us to exploit both historical data about train movements and exogenous weather data provided by the national weather services. The performance of these methods has been tuned through the state-of-the-art thresholdout technique, a powerful procedure that relies on differential privacy theory. Finally, the performance of two efficient implementations of shallow and deep extreme learning machines, which fully exploit recent in-memory large-scale data processing technologies, has been compared with that of the current state-of-the-art TD prediction systems. Results on real-world data from the Italian railway network show that the proposed approach remarkably improves on the state-of-the-art systems.

Dynamic delay predictions for large-scale railway networks: Deep and shallow extreme learning machines tuned via thresholdout

Oneto, Luca;
2017-01-01

2017
Oneto, Luca; Fumeo, Emanuele; Clerico, Giorgio; Canepa, Renzo; Papa, Federico; Dambra, Carlo; Mazzino, Nadia; Anguita, Davide

Use this identifier to cite or link to this document: https://hdl.handle.net/11568/996670