SciPost Submission Page
Super-Resolving Normalising Flows for Lattice Field Theories
by Marc Bauer, Renzo Kapust, Jan Martin Pawlowski, Finn Leon Temmen
Submission summary
| Submission information | |
|---|---|
| Authors (as registered SciPost users): | Renzo Kapust |
| Preprint Link: | scipost_202502_00013v2 (pdf) |
| Date accepted: | Aug. 29, 2025 |
| Date submitted: | July 4, 2025, 4:30 p.m. |
| Submitted by: | Renzo Kapust |
| Submitted to: | SciPost Physics |
| Ontological classification | |
|---|---|
| Academic field: | Physics |
| Specialties: | |
| Approach: | Computational |
Abstract
We propose a renormalisation-group-inspired normalising flow that combines benefits from traditional Markov chain Monte Carlo methods and standard normalising flows to sample lattice field theories. Specifically, we use samples from a coarse lattice field theory and learn a stochastic map to the targeted fine theory. The devised architecture allows for systematic improvements and efficient sampling on lattices as large as $128 \times 128$ in all phases, given sampling access only on a $4\times 4$ lattice. This paves the way for reaping the benefits of traditional MCMC methods on coarse lattices while using normalising flows to learn transformations towards finer grids, aligning nicely with the intuition of super-resolution tasks. Moreover, by optimising the base distribution, this approach allows for further structural improvements besides increasing the expressivity of the model.
Author indications on fulfilling journal expectations
- Provide a novel and synergetic link between different research areas.
- Open a new pathway in an existing or a new research direction, with clear potential for multi-pronged follow-up work
- Detail a groundbreaking theoretical/experimental/computational discovery
- Present a breakthrough on a previously-identified and long-standing research stumbling block
Author comments upon resubmission
List of changes
- (Page 1) We now mention already in the introduction that the actions on the coarse and fine lattice have the same structural form. On the same note, we mention that the fine couplings are fixed and the coarse couplings are optimised w.r.t. the sampling quality on the fine lattice. This information would otherwise only be provided in Section V.
- (Page 3) In Eq. 9 we now refrain from calling $\tilde{p}(\tilde{\phi})$ the density's push-forward, and merely introduce the notation in which a tilde refers to objects that were pushed through the map $\mathcal{T}_\theta$.
- (Page 5) We now stress that the noise scale is a single parameter for each upsampling layer $\mathcal{T}_{\theta_i}$.
- (Page 6) We have now corrected a notational quibble: because the learned map is only a diffeomorphism between the combined (coarse dof, noise dof) and the fine dof, we now also denote the dependence of the $\log \det J_\mathcal{T}$ term accordingly (see the schematic sketch after this list).
- (Page 7) Same as above.
- (Page 8) We note that the couplings do not necessarily evolve along lines of constant physics, but rather evolve during training such that the sampling on the fine lattice becomes optimal.
- (Page 9) We now mention explicitly that we train a flow (and correspondingly a coupling) for each targeted fine lattice size and fine coupling.
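As a schematic illustration of the notational point in the Page 3 and Page 6 items above (a generic sketch, not a verbatim excerpt from the manuscript): the upsampling map takes the coarse field $\varphi$ and the noise $\zeta$ as a combined argument, and the Jacobian determinant is accordingly taken with respect to this combined input,

$$ \tilde{\phi} = \mathcal{T}_\theta(\varphi, \zeta), \qquad \log \det J_{\mathcal{T}_\theta}(\varphi, \zeta) = \log \left| \det \frac{\partial\, \mathcal{T}_\theta(\varphi, \zeta)}{\partial\, (\varphi, \zeta)} \right|, $$

so that both the coarse and the noise degrees of freedom appear in the $\log \det J_\mathcal{T}$ term.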
Published as SciPost Phys. 19, 077 (2025)
Reports on this Submission
Report #2 by Anonymous (Referee 2) on 2025-8-27 (Invited Report)
- Cite as: Anonymous, Report on arXiv:scipost_202502_00013v2, delivered 2025-08-26, doi: 10.21468/SciPost.Report.11810
Report
I wish to thank the authors for addressing my comments and questions. All of my substantive outstanding concerns have been alleviated. I have one last minor issue to point out, as well as a few parting comments and optional minor suggestions that the authors may choose to incorporate. However, these points are not critical, so I am happy to recommend the work for publication in any case without further review.
- It seems like a term for the density of the noise draws $\zeta$ is missing from the loss as defined in equations 31-33. It should be present formally (unless it is implicit in some of the other definitions). More importantly, the noise distribution has a learnable parameter $\sigma_\zeta$, which must enter somewhere in the loss in order to be optimized.
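For concreteness, one minimal sketch of how such a term could enter (assuming a reverse-KL-type objective and a Gaussian noise density $q_{\sigma_\zeta}$ with learnable width $\sigma_\zeta$; this illustrates the point above rather than reproducing Eqs. 31-33):

$$ \mathcal{L}(\theta, \sigma_\zeta) = \mathbb{E}_{\varphi \sim q,\, \zeta \sim q_{\sigma_\zeta}} \Big[ \log q(\varphi) + \log q_{\sigma_\zeta}(\zeta) - \log \big| \det J_{\mathcal{T}_\theta}(\varphi, \zeta) \big| + S\big(\mathcal{T}_\theta(\varphi, \zeta)\big) \Big] + \text{const}, $$

where $q$ denotes the coarse base density and $S$ the fine-lattice action; the learnable scale $\sigma_\zeta$ then enters explicitly through $\log q_{\sigma_\zeta}(\zeta)$ and, under the reparametrisation $\zeta = \sigma_\zeta \epsilon$ with fixed $\epsilon$-draws, through every $\zeta$-dependent term.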
Follow-ups on previous points:
- The authors are correct that there are no obvious indications of any problem with the ESS evaluation at batch size 256.
There is a separate but related point that is of minor importance in the context of the present work, but which is worth taking into consideration in the future. In the paper, the ESS estimates and uncertainties are computed from the ESSes of 3 different models, obtained by training identically up to random seed. Although this is not an unreasonable thing to do when some notion of uncertainty is desired, I would argue there are better and safer alternatives (a small numerical sketch follows after these follow-up points):
  - Averaging ESSes over models quantifies what typical sampler quality is achieved by some training protocol of interest, but one is not obligated to consider or sample multiple different models. Instead, one can and should simply take the most performant model found during training. The typical model quality isn't interesting, only the best achievable one.
  - When an error on the ESS is desired, it is better to evaluate it using multiple independent batches for a fixed model. Taking an average over multiple models mixes together the finite-batch uncertainty on the ESS with unrelated training noise.
  - It is more natural to average the ESS in inverse (or, more generally, to consider the statistics of 1/ESS). This also avoids missing cases where the ESS is truly near-zero but high-variance in such a way that it typically evaluates as finite. This is an unfortunately common failure mode, but it would be visible from the min/max-based definition of error bars used in this work, so it does not seem to be an issue here.
- Just to be clear, my point was that any violation of the symmetries of the true distribution in the model will result in reduced ESS. If the ODE flow kernel is translationally equivariant, it cannot possibly learn to correct for the breaking of translation symmetry by the noise insertion step. Using a non-equivariant ODE kernel can actually improve performance in this case by allowing it to compensate. However, this is a matter of optimization and not critical given the good performance already demonstrated.
- Thank you for addressing my quibble. However, one last bit of notation still merits quibbling over. The upscaling map $\mathcal{T}_\theta(\varphi)$, at least when the symbol is used as in Eq. 9, has an implicit $\zeta$ argument and should really be written $\mathcal{T}_\theta(\varphi,\zeta)$. As written, it makes it appear as if the flow is either injective into the fine space from the coarse space, rather than bijective from the coarse + noise spaces, or as if its definition involves a marginalization of some sort. I was significantly confused about this initially, so making it explicit could improve readability. However, one can figure out from the broader context that there must be an implicit $\zeta$ input, so this is not an essential change.
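As a minimal numerical sketch of the 1/ESS statistics mentioned in the first follow-up point (assuming the standard importance-sampling definition $\mathrm{ESS}/N = (\sum_i w_i)^2 / (N \sum_i w_i^2)$; the log-weights, batch size, and batch count below are placeholders rather than evaluations from the paper):

```python
import numpy as np

def ess_fraction(log_w):
    """Normalised effective sample size ESS/N from importance log-weights."""
    log_w = log_w - np.max(log_w)              # stabilise before exponentiating
    w = np.exp(log_w)
    return (w.sum() ** 2) / (len(w) * (w ** 2).sum())

# Placeholder log importance weights (log p - log q) for one fixed model,
# evaluated on several independent batches of size 256.
rng = np.random.default_rng(0)
log_weights = rng.normal(size=(10, 256))

inv_ess = np.array([1.0 / ess_fraction(lw) for lw in log_weights])

# Averaging 1/ESS over batches keeps a near-zero, high-variance ESS visible
# as a large 1/ESS instead of letting it be averaged away.
print("ESS/N from mean 1/ESS:", 1.0 / inv_ess.mean())
print("std of 1/ESS across batches:", inv_ess.std(ddof=1))
```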
Recommendation
Publish (easily meets expectations and criteria for this Journal; among top 50%)
Strengths
1- novel approach to combine normalising flows with RG-inspired ideas in lattice field theory
2- clearly written
3- provides ample scope for further exploration
Weaknesses
1- the RG aspect is only “RG inspired”; one could quibble about what a proper RG transformation would do (but we won’t)
Report
Requested changes
No further changes requested.
Recommendation
Publish (easily meets expectations and criteria for this Journal; among top 50%)
