An Empirical Verification of Wide Networks Theory

Davide Bacciu
2022-01-01

Abstract

In recent years, many theories have been proposed to explain the behavior of wide neural networks, focusing on the relation between wide networks and Neural Tangent Kernels and on devising a novel optimization theory for overparameterized models. Despite these efforts, real-world models are still not well understood. To help close this gap, we empirically measure crucial quantities for neural networks in the more realistic setting of mildly overparameterized models, in three main areas: conditioning of the optimization process, training speed, and generalization of the obtained models. We analyze the results and highlight discrepancies between existing theories and realistic models, to guide future work on theoretical refinements. Our contribution is exploratory in nature and aims to encourage the development of mixed theoretical-practical approaches, in which experiments are quantitative and aimed at measuring the fundamental quantities of existing theories.

Use this identifier to cite or link to this document: https://hdl.handle.net/11568/1190667