The edge of chaos: quantum field theory and deep neural networks

Kevin T. Grosvenor, Ro Jefferson

Research output: Contribution to journal, Article, Academic, peer-review

Abstract

We explicitly construct the quantum field theory corresponding to a general class of deep neural networks encompassing both recurrent and feedforward architectures. We first consider the mean-field theory (MFT) obtained as the leading saddlepoint in the action, and derive the condition for criticality via the largest Lyapunov exponent. We then compute the loop corrections to the correlation function in a perturbative expansion in the ratio of depth T to width N, and find a precise analogy with the well-studied O(N) vector model, in which the variance of the weight initializations plays the role of the 't Hooft coupling. In particular, we compute both the O(1) corrections quantifying fluctuations from typicality in the ensemble of networks, and the subleading O(T/N) corrections due to finite-width effects. These provide corrections to the correlation length that controls the depth to which information can propagate through the network, and thereby sets the scale at which such networks are trainable by gradient descent. Our analysis provides a first-principles approach to the rapidly emerging NN-QFT correspondence, and opens several interesting avenues to the study of criticality in deep neural networks.
Original language: English
Article number: 81
Number of pages: 65
Journal: SciPost Phys.
Volume: 12
Issue number: 3
Publication status: Published - Mar 2022

Bibliographical note

Funding Information:
It is a pleasure to thank James Giammona, Jim Halverson, Anindita Maiti, Dan Roberts, Keegan Stoner, and Sho Yaida for comments on a draft of this manuscript, as well as Johanna Erdmenger, Boris Hanin, and Soon H. Lim for discussions. K.T.G. acknowledges financial support from the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy through the Würzburg-Dresden Cluster of Excellence on Complexity and Topology in Quantum Matter ct.qmat (EXC 2147, project id 390858490), as well as the Hallwachs-Röntgen Postdoc Program of ct.qmat. K.T.G. has also received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 101024967.

Publisher Copyright:
Copyright K. T. Grosvenor and R. Jefferson.

Keywords

  • Computation
  • Large-n limit
