AI-enabled Cybersecurity using Synthetic Data

Baiardi, Fabrizio; Ruggieri, Salvatore; Sammartino, Vincenzo

doi:10.1109/percomworkshops65533.2025.00055

Historical data is not always adequate for training AI models that secure ICT infrastructures, because the risk landscape changes continuously. This paper introduces a novel methodology that combines AI-driven adversary simulation with digital twin technology to generate synthetic data for training such models. A security twin extends an infrastructure inventory with information on current vulnerabilities and attacks. By describing threat agents through further twins, we simulate their attack strategies to discover how they exploit the infrastructure's vulnerabilities and implement their intrusions. A Monte Carlo approach runs multiple independent simulations to capture alternative intrusion scenarios. The method addresses the challenge of data shift in cybersecurity by producing synthetic data that faithfully describes rapidly evolving environments, which in turn enables more accurate risk management and better resilience. Initial experimental results demonstrate the effectiveness of security twins in assessing and managing the risk due to intrusions. Extending digital twin technology to proactive cybersecurity has significant implications for smart industries, healthcare, and critical infrastructure defence.
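
The Monte Carlo idea summarised above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy topology, the success probabilities, the random exploit-selection strategy, and the names `SECURITY_TWIN`, `simulate_intrusion`, and `monte_carlo` are all assumptions made for illustration. Each run is an independent simulated intrusion, and the collected runs form a small synthetic dataset of attack paths.

```python
import random
from collections import Counter

# Hypothetical security twin: host -> exploitable links to other hosts,
# each with an assumed probability that the exploit succeeds.
SECURITY_TWIN = {
    "internet": [("web", 0.6), ("vpn", 0.2)],
    "web": [("app", 0.5)],
    "vpn": [("app", 0.7), ("db", 0.3)],
    "app": [("db", 0.4)],
    "db": [],
}


def simulate_intrusion(twin, start, goal, max_attempts, rng):
    """One independent run of a hypothetical attacker twin.

    At each step the attacker picks, uniformly at random, one exploit
    leading from an already compromised host to a new one.
    """
    owned = [start]  # hosts compromised so far, in order of compromise
    for _ in range(max_attempts):
        options = [
            (dst, p)
            for src in owned
            for dst, p in twin[src]
            if dst not in owned
        ]
        if not options:
            break  # nothing left to attack
        dst, p = rng.choice(options)
        if rng.random() < p:  # the exploit succeeds with probability p
            owned.append(dst)
            if dst == goal:
                return {"success": True, "path": owned}
    return {"success": False, "path": owned}


def monte_carlo(twin, start, goal, runs, max_attempts=6, seed=0):
    """Run many independent simulations and return one record per run."""
    rng = random.Random(seed)  # seeded so the dataset is reproducible
    return [
        simulate_intrusion(twin, start, goal, max_attempts, rng)
        for _ in range(runs)
    ]


records = monte_carlo(SECURITY_TWIN, "internet", "db", runs=1000)
success_rate = sum(r["success"] for r in records) / len(records)
path_counts = Counter(tuple(r["path"]) for r in records if r["success"])

print(f"estimated probability of reaching the goal: {success_rate:.3f}")
for path, count in path_counts.most_common(3):
    print(count, " -> ".join(path))
```

In this sketch, `records` plays the role of the synthetic data: labelled intrusion traces that a downstream model could be trained on. Regenerating them after the twin is updated with new vulnerabilities is one way to keep the training data aligned with a changing environment.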