CINECA IRIS Institutional Research Information System

We propose a CMOS Analog Vector-Matrix Multiplier for Deep Neural Networks, implemented in a standard single-poly 180 nm CMOS technology. The learning weights are stored in analog floating-gate memory cells embedded in current mirrors implementing the multiplication operations. We experimentally verify the analog storage capability of designed single-poly floating-gate cells, the accuracy of the multiplying function of proposed tunable current mirrors, and the effective number of bits of the analog operation. We perform system-level simulations to show that an analog deep neural network based on the proposed vector-matrix multiplier can achieve an inference accuracy comparable to digital solutions with an energy efficiency of 26.4 TOPs/J, a layer latency close to 100 mu s and an intrinsically high degree of parallelism. Our proposed design has also a cost advantage, considering that it can be implemented in a standard single-poly CMOS process flow.

Analog Vector-Matrix Multiplier Based on Programmable Current Mirrors for Neural Network Integrated Circuits

Maksym Paliy^Primo;Sebastiano Strangio^Secondo;Piero Ruiu;Tommaso Rizzo;Giuseppe Iannaccone^Ultimo

2020-01-01

Abstract

We propose a CMOS Analog Vector-Matrix Multiplier for Deep Neural Networks, implemented in a standard single-poly 180 nm CMOS technology. The learning weights are stored in analog floating-gate memory cells embedded in current mirrors implementing the multiplication operations. We experimentally verify the analog storage capability of designed single-poly floating-gate cells, the accuracy of the multiplying function of proposed tunable current mirrors, and the effective number of bits of the analog operation. We perform system-level simulations to show that an analog deep neural network based on the proposed vector-matrix multiplier can achieve an inference accuracy comparable to digital solutions with an energy efficiency of 26.4 TOPs/J, a layer latency close to 100 mu s and an intrinsically high degree of parallelism. Our proposed design has also a cost advantage, considering that it can be implemented in a standard single-poly CMOS process flow.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno
	
				2020
			
	Codice DOI
	
				https://dx.doi.org/10.1109/access.2020.3037017
			
	Tutti gli autori
	
						Paliy, Maksym; Strangio, Sebastiano; Ruiu, Piero; Rizzo, Tommaso; Iannaccone, Giuseppe
					
	Appare nelle tipologie:
	
				1.1 Articolo in rivista

File in questo prodotto:

File	Dimensione	Formato
09252918.pdf accesso aperto Descrizione: articolo principale Tipologia: Versione finale editoriale Licenza: Creative commons Dimensione 2.14 MB Formato Adobe PDF Visualizza/Apri	2.14 MB	Adobe PDF	Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11568/1065099

Citazioni

ND

26

22

social impact