SciPost Phys. 12, 081 (2022) · published 3 March 2022
We explicitly construct the quantum field theory corresponding to a general class of deep neural networks encompassing both recurrent and feedforward architectures. We first consider the mean-field theory (MFT) obtained as the leading saddlepoint in the action, and derive the condition for criticality via the largest Lyapunov exponent. We then compute the loop corrections to the correlation function in a perturbative expansion in the ratio of depth T to width N, and find a precise analogy with the well-studied O(N) vector model, in which the variance of the weight initializations plays the role of the 't Hooft coupling. In particular, we compute both the O(1) corrections quantifying fluctuations from typicality in the ensemble of networks, and the subleading O(T/N) corrections due to finite-width effects. These provide corrections to the correlation length that controls the depth to which information can propagate through the network, and thereby sets the scale at which such networks are trainable by gradient descent. Our analysis provides a first-principles approach to the rapidly emerging NN-QFT correspondence, and opens several interesting avenues to the study of criticality in deep neural networks.
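The criticality condition mentioned in this abstract can be illustrated with the standard mean-field recursion for a random feedforward tanh network, as used elsewhere in the literature. The sketch below is not taken from the paper: the tanh activation, the symbols `sigma_w2` and `sigma_b2` (weight and bias variances), and the helper names are all assumptions made for illustration, and the paper's own setup, which also covers recurrent architectures, may differ. It iterates the pre-activation variance to its fixed point and evaluates the susceptibility `chi`, whose logarithm acts as a Lyapunov-type exponent, with `chi = 1` marking criticality.

```python
import numpy as np

# Gauss-Hermite nodes and weights for expectations over z ~ N(0, 1).
_nodes, _weights = np.polynomial.hermite_e.hermegauss(80)
_weights = _weights / _weights.sum()  # normalize to a probability measure


def gauss_mean(f):
    """Approximate E[f(z)] for a standard normal z by quadrature."""
    return float(np.sum(_weights * f(_nodes)))


def fixed_point_variance(sigma_w2, sigma_b2, iters=500):
    """Iterate q -> sigma_w2 * E[tanh(sqrt(q) z)^2] + sigma_b2 (assumed recursion)."""
    q = 1.0
    for _ in range(iters):
        q = sigma_w2 * gauss_mean(lambda z: np.tanh(np.sqrt(q) * z) ** 2) + sigma_b2
    return q


def chi(sigma_w2, sigma_b2):
    """Susceptibility sigma_w2 * E[tanh'(sqrt(q*) z)^2] at the fixed point q*."""
    q = fixed_point_variance(sigma_w2, sigma_b2)
    return sigma_w2 * gauss_mean(lambda z: (1.0 - np.tanh(np.sqrt(q) * z) ** 2) ** 2)


def lyapunov_like_exponent(sigma_w2, sigma_b2):
    """log(chi): negative in the ordered phase, positive in the chaotic phase."""
    return float(np.log(chi(sigma_w2, sigma_b2)))


if __name__ == "__main__":
    for sw2 in (0.5, 1.5, 2.0):
        print(sw2, chi(sw2, 0.0), lyapunov_like_exponent(sw2, 0.0))
```

In this toy setting with zero bias variance, `chi` is below 1 for small weight variance and above 1 for large weight variance. Near `chi = 1` the depth scale `1 / |log(chi)|` grows without bound, which is the sense in which the correlation length limits how deep information can propagate.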
Johanna Erdmenger, Kevin T. Grosvenor, Ro Jefferson
SciPost Phys. 12, 041 (2022) · published 26 January 2022
We investigate the analogy between the renormalization group (RG) and deep neural networks, wherein subsequent layers of neurons are analogous to successive steps along the RG. In particular, we quantify the flow of information by explicitly computing the relative entropy or Kullback-Leibler divergence in both the one- and two-dimensional Ising models under decimation RG, as well as in a feedforward neural network as a function of depth. We observe qualitatively identical behavior characterized by the monotonic increase to a parameter-dependent asymptotic value. On the quantum field theory side, the monotonic increase confirms the connection between the relative entropy and the c-theorem. For the neural networks, the asymptotic behavior may have implications for various information maximization methods in machine learning, as well as for disentangling compactness and generalizability. Furthermore, while both the two-dimensional Ising model and the random neural networks we consider exhibit non-trivial critical points, the relative entropy appears insensitive to the phase structure of either system. In this sense, more refined probes are required in order to fully elucidate the flow of information in these models.
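As a rough illustration of the decimation RG referred to in this abstract, the sketch below iterates the textbook recursion for the zero-field one-dimensional Ising chain, in which summing over every other spin maps the dimensionless coupling K to K' = (1/2) ln cosh(2K). The function names are hypothetical, and the sketch only tracks the flow of the coupling; it does not reproduce the relative entropy computation described in the abstract.

```python
import math


def decimate(K):
    """One decimation step for the zero-field 1d Ising chain: K -> 0.5 * ln cosh(2K)."""
    return 0.5 * math.log(math.cosh(2.0 * K))


def flow(K0, steps):
    """Return the couplings [K0, K1, ..., K_steps] along the decimation flow."""
    couplings = [K0]
    for _ in range(steps):
        couplings.append(decimate(couplings[-1]))
    return couplings


if __name__ == "__main__":
    for step, K in enumerate(flow(1.0, 6)):
        print(step, K)
```

The recursion is equivalent to tanh(K') = tanh(K)^2, so any finite starting coupling flows monotonically toward the trivial fixed point K = 0, consistent with the absence of a finite-temperature phase transition in one dimension.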
Johanna Erdmenger, Kevin T. Grosvenor, Ro Jefferson
SciPost Phys. 8, 073 (2020) · published 6 May 2020
Motivated by the increasing connections between information theory and high-energy physics, particularly in the context of the AdS/CFT correspondence, we explore the information geometry associated to a variety of simple systems. By studying their Fisher metrics, we derive some general lessons that may have important implications for the application of information geometry in holography. We begin by demonstrating that the symmetries of the physical theory under study play a strong role in the resulting geometry, and that the appearance of an AdS metric is a relatively general feature. We then investigate what information the Fisher metric retains about the physics of the underlying theory by studying the geometry for both the classical 2d Ising model and the corresponding 1d free fermion theory, and find that the curvature diverges precisely at the phase transition on both sides. We discuss the differences that result from placing a metric on the space of theories vs. states, using the example of coherent free fermion states. We compare the latter to the metric on the space of coherent free boson states and show that in both cases the metric is determined by the symmetries of the corresponding density matrix. We also clarify some misconceptions in the literature pertaining to different notions of flatness associated to metric and non-metric connections, with implications for how one interprets the curvature of the geometry. Our results indicate that in general, caution is needed when connecting the AdS geometry arising from certain models with the AdS/CFT correspondence, and seek to provide a useful collection of guidelines for future progress in this exciting area.
SciPost Phys. 6, 042 (2019) · published 5 April 2019
We show how the traversable wormhole induced by a double-trace deformation of the thermofield double state can be understood as a modular inclusion of the algebras of exterior operators. The effect of this deformation is the creation of a new region of spacetime deep in the bulk, corresponding to a non-trivial center between the left and right algebras. This set-up provides a precise framework for investigating how black hole interiors are encoded in the CFT. In particular, we use modular theory to demonstrate that state dependence is an inevitable feature of any attempt to represent operators behind the horizon. Building on this geometrical structure, we propose that modular inclusions may provide a more precise means of investigating the nascent relationship between entanglement and geometry in the context of the emergent spacetime paradigm.
Shira Chapman, Jens Eisert, Lucas Hackl, Michal P. Heller, Ro Jefferson, Hugo Marrochio, Robert C. Myers
SciPost Phys. 6, 034 (2019) · published 15 March 2019
Motivated by holographic complexity proposals as novel probes of black hole spacetimes, we explore circuit complexity for thermofield double (TFD) states in free scalar quantum field theories using the Nielsen approach. For TFD states at t = 0, we show that the complexity of formation is proportional to the thermodynamic entropy, in qualitative agreement with holographic complexity proposals. For TFD states at t > 0, we demonstrate that the complexity evolves in time and saturates after a time of the order of the inverse temperature. The latter feature, which is in contrast with the results of holographic proposals, is due to the Gaussian nature of the TFD state of the free bosonic QFT. A novel technical aspect of our work is framing complexity calculations in the language of covariance matrices and the associated symplectic transformations, which provide a natural language for dealing with Gaussian states. Furthermore, for free QFTs in 1+1 dimensions, we compare the dynamics of circuit complexity with the time dependence of the entanglement entropy for simple bipartitions of TFDs. We relate our results for the entanglement entropy to previous studies on non-equilibrium entanglement evolution following quenches. We also present a new analytic derivation of a logarithmic contribution due to the zero momentum mode in the limit of vanishing mass for a subsystem containing a single degree of freedom on each side of the TFD and argue why a similar logarithmic growth should be present for larger subsystems.