SciPost Submission Page
Accuracy of Restricted Boltzmann Machines for the one-dimensional $J_1-J_2$ Heisenberg model
by Luciano Loris Viteritti, Francesco Ferrari, Federico Becca
This is not the latest submitted version.
This Submission thread is now published as
Submission summary
Authors (as registered SciPost users):  Francesco Ferrari · Luciano Loris Viteritti 
Submission information  

Preprint Link:  https://arxiv.org/abs/2202.07576v1 (pdf) 
Date submitted:  2022-02-21 17:23 
Submitted by:  Viteritti, Luciano Loris 
Submitted to:  SciPost Physics 
Ontological classification  

Academic field:  Physics 
Specialties: 

Approach:  Computational 
Abstract
Neural networks have been recently proposed as variational wave functions for quantum many-body systems [G. Carleo and M. Troyer, Science 355, 602 (2017)]. In this work, we focus on a specific architecture, known as Restricted Boltzmann Machine (RBM), and analyse its accuracy for the spin-1/2 $J_1-J_2$ antiferromagnetic Heisenberg model in one spatial dimension. The ground state of this model has a non-trivial sign structure, especially for $J_2/J_1>0.5$, forcing us to work with complex-valued RBMs. Two variational Ansätze are discussed: one defined through a fully complex RBM, and one in which two different real-valued networks are used to approximate modulus and phase of the wave function. In both cases, translational invariance is imposed by considering linear combinations of RBMs, giving access also to the lowest-energy excitations at fixed momentum $k$. We perform a systematic study on small clusters to evaluate the accuracy of these wave functions in comparison to exact results, providing evidence for the supremacy of the fully complex RBM. Our calculations show that this kind of Ansätze is very flexible and describes both gapless and gapped ground states, also capturing the incommensurate spin-spin correlations and low-energy spectrum for $J_2/J_1>0.5$. The RBM results are also compared to the ones obtained with Gutzwiller-projected fermionic states, often employed to describe quantum spin models [F. Ferrari, A. Parola, S. Sorella and F. Becca, Phys. Rev. B 97, 235103 (2018)]. Contrary to the latter class of variational states, the fully-connected structure of RBMs hampers the transferability of the wave function from small to large clusters, implying an increase of the computational cost with the system size.
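The construction described in the abstract (a complex-valued RBM combined over lattice translations to fix the momentum $k$) can be illustrated with a minimal sketch. It assumes the standard Carleo-Troyer form of the RBM amplitude and a plain sum over translated configurations with phases exp(-i k r); the sign convention, the function names, the hidden-unit density `alpha`, and the random parameters are illustrative assumptions and not necessarily the authors' exact construction.

```python
import numpy as np

def rbm_amplitude(sigma, a, b, W):
    """Complex RBM amplitude for a spin configuration sigma with entries +/-1.

    Assumed form: exp(a . sigma) * prod_j 2 cosh(b_j + sum_i W_ji sigma_i).
    """
    theta = b + W @ sigma  # one complex angle per hidden unit
    return np.exp(a @ sigma) * np.prod(2.0 * np.cosh(theta))

def momentum_projected_amplitude(sigma, a, b, W, k):
    """Linear combination over all translations, weighted by exp(-i k r)."""
    L = sigma.size
    return sum(
        np.exp(-1j * k * r) * rbm_amplitude(np.roll(sigma, r), a, b, W)
        for r in range(L)
    )

# Hypothetical small example: L sites, alpha * L hidden units, random parameters.
rng = np.random.default_rng(0)
L, alpha = 6, 2
M = alpha * L

def small_complex(*shape):
    # Small random complex numbers, used only to have something to evaluate.
    return 0.1 * (rng.normal(size=shape) + 1j * rng.normal(size=shape))

a, b, W = small_complex(L), small_complex(M), small_complex(M, L)

sigma = np.array([1, -1, 1, 1, -1, -1], dtype=float)
k = 2 * np.pi * 2 / L  # allowed momenta are 2*pi*n/L on a periodic chain

psi = momentum_projected_amplitude(sigma, a, b, W, k)
shifted = momentum_projected_amplitude(np.roll(sigma, 1), a, b, W, k)

# Translating the configuration by one site should only multiply the
# amplitude by the phase exp(i k), which is what fixed momentum means here.
print(np.allclose(shifted, np.exp(1j * k) * psi))
```

In this sketch every weight couples every hidden unit to every site, so the number of parameters grows with the cluster size; this is the fully-connected structure that the abstract identifies as the obstacle to transferring the wave function from small to large clusters.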
Current status:
Reports on this Submission
Anonymous Report 2 on 2022-3-25 (Invited Report)
 Cite as: Anonymous, Report on arXiv:2202.07576v1, delivered 2022-03-25, doi: 10.21468/SciPost.Report.4771
Report
Developing accurate variational methods is one of the central challenges in computational science and physics. Recently, Carleo and Troyer introduced variational wave functions based on artificial neural networks. Given that this research field is still at an early stage, it is an important task to perform systematic benchmark calculations to confirm the accuracy of neural-network variational ansätze.
In the paper, the authors systematically investigate the accuracy of the restricted Boltzmann machine (RBM) wave function for the spin-1/2 J1-J2 Heisenberg model in one spatial dimension.
First, the authors show that the complex RBM (cRBM) performs better than the phase-modulus RBM (pmRBM). Then, the study focuses on the cRBM.
The authors pay attention to the sign structure of the wave function. They then show that the Marshall sign rule helps the optimization, especially when J2 is small. The best cRBM results show better accuracy in the calculation of the ground state, both in energy and correlation functions, compared to the Gutzwiller-projected fermionic states, albeit with a much larger number of variational parameters.
By considering linear combinations of cRBMs, excited states can also be investigated (the accuracy of the ground state is also improved). In this paper, it is shown that cRBMs give highly accurate results for excited states as well.
Finally, the authors discuss size consistency and several possibilities to improve size-consistent behavior.
The paper is clearly written, and the accuracy of the RBM variational ansatz is carefully and systematically investigated. I believe that the present work is one of the important pieces of recent intensive investigations of neural-network quantum states. Thus, I recommend that this paper be accepted for publication in SciPost.
Below, I list several points (all are minor).
In Fig. 6, the definition of N_MC is not very clear. In particular, what is the difference between N_opt in Fig. 4 and N_MC in Fig. 6?
In Fig. 8, and on page 12, the overlap <Psi_0|Psi_cRBM> should be a complex number. Do the authors show the absolute value?
According to the inset of Fig. 11 right panel, the relative error of the pBCS at k=0 is smaller than that of the cRBM. However, looking at the right panel, the error of the pBCS appears to be considerably larger than that of the cRBM.
Anonymous Report 1 on 2022-3-17 (Invited Report)
 Cite as: Anonymous, Report on arXiv:2202.07576v1, delivered 2022-03-17, doi: 10.21468/SciPost.Report.4716
Strengths
1- Very timely
2- Well and clearly written
3- Thorough performance analysis of restricted Boltzmann machines in a well-studied example
4- Good discussion of advantages and drawbacks of RBMs and comparison to other numerical methods
Weaknesses
1- No new physics
2- No comparison with DMRG attempted here, DMRG being the state-of-the-art method.
Report
Neural networks have emerged as a novel route to study quantum phases of many-body systems. Here, specifically, the authors are interested in evaluating the performance of so-called restricted Boltzmann machines (RBM) in characterizing ground-state properties of quantum magnets. Since the sign structure of the ground-state wave function of frustrated systems hampers the applicability of QMC techniques, the authors decide to study a one-dimensional model of frustrated magnetism, the J1-J2 chain. The sign structure necessitates the use of complex-valued RBMs, for which two ansätze are compared. A fully complex network is shown to be superior. Moreover, translational symmetry is enforced via projection operators, and a certain sign structure can either be pre-imposed on the RBM or not.
The authors compute variational energies, spin correlation functions and the energies of momentum-resolved excited states. These results are compared to those of Lanczos calculations (which can be considered an exact benchmark) and a Gutzwiller-type variational ansatz (pBCS). The authors demonstrate good quantitative agreement of the RBM results with exact diagonalization.
They discuss the important question of scaling up the simulations to larger systems, where the considered RBMs do not appear to be a promising route. The physically motivated variational ansatz, however, can be extended to larger systems in a straightforward manner. Alternative neural networks are identified that may be better suited for larger systems. Moreover, the variational parameters of the RBMs lack a physical interpretation. Further, the number of variational parameters for the pBCS (projected Gutzwiller) is much lower than for the RBMs.
The paper further contains very interesting results on the behavior of the RBMs during the training phase, e.g., for the phases or the average sign.
The paper is well written and accessible, also to people who do not work with neural networks/machine learning. It does not provide new insights into the physics of the studied model, but the performance of the RBMs is very carefully evaluated and discussed. It appears that RBMs are not the most promising route for frustrated magnets. The technical details of training the networks will presumably be of interest to machine-learning practitioners.
Overall, I conclude that this is certainly very good research and publishable science that will be of interest to its target community. The fact that RBMs are very critically evaluated is an important piece of information; follow-up work may be triggered along the directions laid out in the conclusions section.
Requested changes
Necessary revisions:
1- The Gutzwiller-projected wave functions are mentioned in the abstract, but neither in the introduction nor in the methods section. I may have missed it, but the acronym is never defined. The authors should add a paragraph on this method in the Methods section, make sure that acronyms are properly introduced and mention the method in the introduction as well.
2- Please correct a few typos: (page 1) ".. wve functions has been defined ...", (page 14): "trasparent"
Some optional questions:
3- Could the authors make a more definite statement about the usefulness of RBMs for 2d frustrated quantum magnets?
4- The method of choice for 1D quantum magnets is still DMRG. Are there any prospects of RBMs or other neural networks becoming competitive for models of frustrated magnetism?
Author: Luciano Loris Viteritti on 2022-04-15 [id 2388]
(in reply to Report 1 on 2022-03-17)
We thank the referee for her/his positive report. In the following we reply to requested changes:
1. The acronym is defined at the beginning of section 3, just before subsection 3.1, and the definition of Gutzwiller-projected wave functions is reported in the Appendix. The acronym is not used before its definition. Gutzwiller-projected states are not described in the body of the paper because they are only used as a comparison for RBM states.
2. We corrected the typos.
3. & 4. We added a couple of sentences in the introduction to comment on these points. As far as the first question is concerned, generic neural-network states can, in principle, describe quantum systems in arbitrary dimension. However, from a practical point of view, higher values of the complexity α could be needed to reach the same accuracy as in the one-dimensional case. Concerning the second question, DMRG represents the best method to solve quantum problems in one dimension, while in two or more dimensions its accuracy deteriorates. In recent years, there have been several attempts to improve both DMRG (by considering tensor-network states) and RBMs (by defining more refined architectures). Today, the latter have reached an accuracy that, in several cases, is competitive with DMRG [see for example K. Choo, T. Neupert, G. Carleo Phys. Rev. B 100 (12), 125124 (2019)].
Author: Luciano Loris Viteritti on 2022-04-15 [id 2389]
(in reply to Report 2 on 2022-03-25)
We thank the referee for her/his positive report. Here we reply to the minor points: