MEM-OPT: A Scheduling and Data Re-Use System to Optimize On-Chip Memory Usage for CNNs On-Board FPGAs
Dinelli G. (first author); Meoni G. (second author); Rapuano E.; Pacini T. (second-to-last author); Fanucci L. (last author)
2020-01-01
Abstract
In recent years, Convolutional Neural Networks (CNNs) have found applications in many fields, from computer vision to speech recognition, showing outstanding results in terms of accuracy. Field Programmable Gate Arrays (FPGAs) have proved to be a promising platform for running CNN algorithms because they offer a remarkable trade-off between power consumption and computational power. However, efficiently implementing CNN models on board an FPGA is a complex task, since the massively parallel processing of CNNs is often limited by FPGA storage capabilities and design congestion. This article introduces MEM-OPT, a scheduling algorithm and data re-use system that aims to optimize on-chip memory usage on-board FPGAs with respect to input feature map storage and the multiply-and-accumulate process of the Processing Elements. The work presents MEM-OPT implementation results on a Xilinx XC7Z020, including hardware resources, maximum clock frequency, and power consumption. MEM-OPT memory requirements are analyzed for LeNet-5, MobileNet, VGG-16, and other state-of-the-art CNNs, showing a reduction of up to 80% in the overall on-chip memory needed to store input feature maps and accumulate output results, compared with alternative solutions available in the literature.