Improvements in semiconductor nanotechnology made chip multiprocessors the reference architecture for high-performance microprocessors. CMPs usually adopt large Last-Level Caches (LLC) shared among cores and private L1 caches, whose performances depend on the wire-delay dominated response time of LLC. NUCA (NonUniform Cache Architecture) caches represent a viable solution for tolerating wire-delay effects. In this article, we present Re-NUCA, a NUCA cache that exploits replication of blocks inside the LLC to avoid performance limitations of D-NUCA caches due to conflicting access to shared data. Results show that a Re-NUCA LLC permits to improve performances of more than 5% on average, and up to 15% for applications that strongly suffer from conflicting access to shared data, while reducing network traffic and power consumption with respect to D-NUCA caches. Besides, it outperforms different S-NUCA schemes optimized with victim replication.
|Autori:||P. Foglia; M. Solinas|
|Titolo:||Exploiting Replication to Improve Performances of NUCA-Based CMP Systems|
|Anno del prodotto:||2014|
|Digital Object Identifier (DOI):||10.1145/2566568|
|Appare nelle tipologie:||1.1 Articolo in rivista|