SciPost Submission Page
Physics-informed neural networks viewpoint for solving the Dyson-Schwinger equations of quantum electrodynamics
by Rodrigo Carmo Terin
This is not the latest submitted version.
Submission summary
Authors (as registered SciPost users): Rodrigo Carmo Terin
Submission information
Preprint Link: scipost_202503_00026v1 (pdf)
Date submitted: March 18, 2025, 10:02 a.m.
Submitted by: Carmo Terin, Rodrigo
Submitted to: SciPost Physics
Ontological classification
Academic field: Physics
Specialties:
Approaches: Theoretical, Computational, Phenomenological
Abstract
Physics-informed neural networks (PINNs) are employed to solve the Dyson--Schwinger equations of quantum electrodynamics (QED) in Euclidean space, with a focus on the non-perturbative generation of the fermion's dynamical mass function in the Landau gauge. By inserting the integral equation directly into the loss function, our PINN framework enables a single neural network to learn a continuous and differentiable representation of the mass function over a spectrum of momenta. We also benchmark our approach against a traditional numerical algorithm, highlighting the main differences between the two. Our novel strategy, which can be extended to other quantum field theories, paves the way for forefront applications of machine learning in high-level theoretical physics.
Author indications on fulfilling journal expectations
- Provide a novel and synergetic link between different research areas.
- Open a new pathway in an existing or a new research direction, with clear potential for multi-pronged follow-up work
- Detail a groundbreaking theoretical/experimental/computational discovery
- Present a breakthrough on a previously-identified and long-standing research stumbling block
Current status:
Reports on this Submission
Strengths
1) Very interesting and relevant topic.
Weaknesses
1) Numerical results do not match state-of-the-art results.
Report
The manuscript investigates the interesting topic of utilising physics-informed neural networks to solve Dyson-Schwinger equations. This method is widely used to go beyond the limitations of current numerical techniques for solving differential equations. The topic and application are relevant for publication; unfortunately, the implementation is poor. As the author states in the conclusion, the results do not match state-of-the-art numerical results, rendering the implementation meaningless. However, I am confident that this problem is solvable. Thus, I cannot recommend publication unless the results at least match the state-of-the-art numerical results. To this end, I may be able to suggest ways to work around the inefficiencies.
- On page 12, the author states that the NN is ineffective at learning small B values. The domain indeed has a significant effect on what a PINN can learn. May I suggest reparametrising the problem so that the target is rescaled to lie roughly in [10^-3, 1]. Constraining, rescaling, or reparametrising the problem always helps an NN learn. Additionally, introducing boundary conditions might help constrain the problem.
- I also want to point out the second bullet point raised by the author on the same page regarding NumPy, which has also been raised previously. That point is entirely irrelevant to the success of the paper. There are TensorFlow-based integration modules using differentiable integration tools. Such tools include MC techniques, PyTorch/TensorFlow/JAX-based trapezoid methods, or even a normalising-flow-based integrator [2001.05486] (although the latter is not needed for this particular endeavour). All of these are independent of the problem the author raises. I suggest the author check 2209.15190, 2211.02834, and DOI: 10.1007/s40819-022-01288-3, all of which solve integral equations, and none of which has problems with NumPy. The author's problem here is purely technical and solvable: TF cannot build a computational graph from an arbitrary Pythonic codebase. Hence, it has nothing to do with the scientific outcome or relevance of the paper and should be removed (and fixed).
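For concreteness, a tensorized trapezoid rule of the kind mentioned above can be written as pure array operations with no Python loops (a minimal sketch, not taken from the manuscript; swapping `numpy` for `jax.numpy`, or the TensorFlow equivalents, makes the same code differentiable end-to-end):

```python
import numpy as np


def trapezoid_integral(f_vals, x_grid):
    """Trapezoid rule as pure array operations (no Python loops),
    so the identical code is differentiable under jax.numpy."""
    dx = x_grid[1:] - x_grid[:-1]          # cell widths (grid need not be uniform)
    return np.sum(0.5 * (f_vals[1:] + f_vals[:-1]) * dx)


# sanity check: integral of x^2 on [0, 1] is 1/3
x = np.linspace(0.0, 1.0, 1001)
approx = trapezoid_integral(x**2, x)
```

Because the body contains only tensor primitives (`sum`, slicing, arithmetic), gradients flow through the integral operator automatically in any autodiff framework.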
I genuinely believe that this paper is highly relevant, and the author can address the aforementioned issues. Unfortunately, I cannot recommend it for publication until these are resolved.
I have one further question:
- Given that one has to use all these approximations to be able to solve this problem numerically (with traditional tools), and that the strength of the NN approach is its ability to avoid such approximations: can the author address whether it is possible to solve this problem with a PINN without using at least a subset of the approximations introduced in the manuscript? If the answer is no, then what is the advantage of using a PINN for this particular problem?
Requested changes
I have further minor requests in the text:
1) In equations 3-4, the mass and coupling terms are not defined until equations 15 and 16. For readability, it would help to explain their nature where they are first introduced.
2) The notation in section C is a bit confusing. I expect the NN to approximate B(x), so I would use the notation B(x) \to \tilde{B}(x) (or something along these lines), where the tilde indicates the NN approximation. The notation in the paper looks as though the author is using supervised-type learning where B_target(x) comes from a numeric solution. But I believe that is not what the author is doing. So I would write the loss function as
L = mean( \tilde{B}(x) - constant * Int(\tilde{B} ... ) ... )
Otherwise, equation 22 suggests some sort of supervised learning, which would render the application irrelevant. That is not what the author is doing with the PINN, so it should be clarified.
3) Figure 1 shows only a single visible line; it would be clearer if the author used dashed and dotted lines so that the other curves are visible as well.
4) Figure 4 has to be updated. I expect it to match figure 2 but, unless my eyes deceive me, it is not even close. Additionally, the image has been cut; I suggest fixing the aspect ratio.
5) I strongly suggest the author rewrite pages 12-13; I am confident that all the issues raised there can be resolved.
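As an illustration of the unsupervised residual loss suggested in point 2, a minimal sketch could look like the following (all names are hypothetical, and a toy constant kernel stands in for the actual DSE kernel; the point is only that no external target B appears in the loss):

```python
import numpy as np


def residual_loss(B_tilde, x_grid, coupling=0.1):
    """Hypothetical PINN residual loss: the network's own prediction
    B~(x) must satisfy the integral equation; no target data is used."""
    B = B_tilde(x_grid)                    # network evaluated on the grid
    dx = x_grid[1] - x_grid[0]             # assumes a uniform grid
    # toy stand-in for the kernel integral Int K(x, y) B~(y) dy
    integral = coupling * np.sum(B) * dx
    residual = B - integral                # B~(x) - constant * Int(B~ ...)
    return np.mean(residual**2)
```

Minimizing this quantity drives the network toward a self-consistent solution of the integral equation, which is the unsupervised setting the report describes, as opposed to fitting a precomputed B_target(x).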
Recommendation
Ask for major revision
Strengths
- clearly written
- interesting application
- PINN for integral equations as a novel method
Weaknesses
- numerical validity difficult to assess
- approximations made cast doubt on generality of the method
Report
Requested changes
My specific points are
1. The numerics rely on the "rainbow approximation" for treating the implicit/coupled integral equations. I wonder whether this restricts the generality of the PINN method to this type of problem. I'd like the author to comment on this.
2. Similarly, the assumption of Landau gauge simplifies the equations, but the method itself should work independently of the gauge; an argument along these lines would likewise be appreciated.
3. The author should include a comment on the choice of loss function in equation 22. In addition, it would be good to know whether the author has encountered the collapse of PINNs onto trivial solutions, for instance B = 0, instead of the physically desired solution.
4. My largest issue is the figures. First of all, I suggest merging figures 1 and 3, and figures 2 and 4, into side-by-side pairs, as they show the same results and are meant to demonstrate the validity of the PINN as a numerical method. It would also be important to present A(p^2) in a more meaningful representation. I am surprised by the differences between the results shown in figures 2 and 4: they are very different, and I cannot read the y-axis label clearly. In my view, a comparison between a standard numerical technique and the PINN results is necessary before drawing conclusions.
Recommendation
Ask for major revision
Author: Rodrigo Carmo Terin on 2025-05-14 [id 5478]
(in reply to Report 1 on 2025-04-01)
Comment 1: Rainbow truncation and generality
Reviewer:
“The numerics relies on the ‘rainbow approximation’ for treating the coupled integral equations. Does this restrict the generality of the PINN method?”
Reply: Indeed, the author adopts the rainbow truncation as a well-known first step that captures fundamental non-perturbative features, e.g., dynamical mass generation. However, the proposed PINN framework is not expected to be restricted to this approximation, although the author agrees that it is good research practice to proceed cautiously and investigate this point carefully. As mentioned in sections III and V, future work will consider more sophisticated truncation schemes, such as vertex corrections or the coupled photon–fermion system, which are expected to be incorporated directly into the loss function without modifying the principles of the neural network approach implemented in this work.
Comment 2: Gauge independence
Reviewer: “The choice of Landau gauge simplifies the equations, but the method should work independent of gauge—please comment.”
Reply: Landau gauge was selected to streamline the presentation and simplify the leading-order behavior of A. Nonetheless, the method is, in principle, intended to be gauge independent: one can readily modify the loss function to include the appropriate integral expressions for A and B in other gauges, and the same PINN architecture and training scheme apply. As mentioned in section V, the author is currently investigating, for future work, the possibility of studying the infamous gauge (in)dependence issue explicitly. However, it is worth noting that traditional numerical algorithms are generally gauge dependent when truncations are employed, unless specific care is taken to preserve gauge invariance, for instance via Ward–Takahashi identities. Although the author is confident that a gauge-invariant PINN formulation could be developed under suitable conditions, the current expectation within the Dyson–Schwinger equations community is that this novel method should first reproduce established numerical results, which are often regarded as the benchmarks. From this viewpoint, PINNs are expected to reflect the same gauge dependence as standard discretization-based approaches, without introducing additional artifacts beyond those arising from the chosen truncation.
Comment 3: Loss function and trivial solutions
Reviewer: “Please explain the choice of loss in Eq.~(22) and whether the PINN ever collapsed to trivial solutions (e.g. B = 0).”
Reply: The baseline loss in equation (22) is a standard mean-squared error (MSE) between the predicted and target values of B. However, in very early experiments the network often collapsed to the trivial solution B ≈ 0, which is mathematically consistent but physically irrelevant. To mitigate this, the following measures were taken:
- A softplus activation was used to ensure positivity of B.
- The bias of the output layer was initialized to match B ~ 10⁻³ in the infrared.
- Additional loss components were introduced, targeting mid, high, and UV momentum intervals with log-scale penalties.
These refinements are detailed in Section III, subsections A and C, and were essential to recovering physically meaningful solutions that match traditional numerical results, especially the last one (which is the novelty of the present version of the work).
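The first two measures (softplus positivity and the infrared bias initialization) can be sketched as follows; this is a minimal NumPy illustration with hypothetical names, not the manuscript's actual network architecture:

```python
import numpy as np


def softplus(z):
    # smooth, strictly positive activation: log(1 + e^z) > 0 for all z
    return np.logaddexp(0.0, z)


# initialize the output bias so the network starts near B ~ 1e-3
# in the infrared: softplus(b0) = 1e-3  =>  b0 = log(e^{1e-3} - 1)
b0 = np.log(np.expm1(1e-3))


def B_output(pre_activation, bias=b0):
    """Output layer of the mass-function network: positivity is
    guaranteed by softplus, and the infrared scale B ~ 1e-3 is set
    by the bias initialization."""
    return softplus(pre_activation + bias)
```

At initialization (pre-activation near zero) the output sits close to 10⁻³, steering the optimizer away from the trivial B = 0 basin while the softplus keeps B strictly positive everywhere.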
Comment 4: Figures and comparison with the traditional solver
Reviewer:
“Please merge Figs.~1 & 3 and Figs.~2 & 4 into side-by-side panels, improve the A(p²) presentation, and include a direct comparison to the standard numerical method.”
Reply: The figures have been updated accordingly. The author now presents side-by-side plots comparing the traditional numerical approach and the PINN results for both A and B (see figures 1 and 2 in the revised manuscript). Axis labels and font sizes have been standardized for readability. Since A = 1 identically in Landau gauge, this fact is now emphasized in both the figure caption and the discussion in section IV, subsection A. In the updated figure 2 (which replaces figure 1 of the previous version), all curves corresponding to different values of alpha appear superposed as a single line. This is because, under the rainbow truncation in the above-mentioned gauge, the function A = 1 is an exact solution of the DSE for the fermion propagator. This gauge/truncation choice removes the contribution of the vector part of the fermion self-energy, and thus A does not acquire any nontrivial momentum dependence. The identical curves reflect this known analytic behavior.
Finally, the side-by-side comparison clearly shows that the updated PINN framework, with improved loss components and initialization strategies, achieves excellent agreement with the benchmark traditional method over all momentum scales.
Author: Rodrigo Carmo Terin on 2025-05-14 [id 5479]
(in reply to Report 2 on 2025-04-28)
Comment: Matching state-of-the-art numerics
Reviewer:
“The PINN results do not match state-of-the-art numerical solutions, making the implementation seem meaningless unless it at least matches them.”
Reply: The author acknowledges that quantitative agreement with traditional numerical algorithms remains important, particularly since those methods often define the standard for validating PINN-based results. In the revised section IV, subsection A, the following improvements were introduced to address both viewpoints:
As a result, the PINN now reproduces the benchmark solution for B. The improved performance is visualized in the updated Figure 1.
Comment: Use of NumPy vs. differentiable tools
Reviewer:
“The NumPy reliance is irrelevant; use TF or JAX integrators (e.g. MC or tensorized trapezoid).”
Reply: All calls to NumPy within the training loops have been removed. The entire PINN training pipeline is now implemented in JAX with differentiable operations throughout. This change enables full gradient flow through the integral operators and brings the author's approach in line with best practices in differentiable programming.
Comment: What is the fundamental advantage of using a PINN?
Reviewer:
“Given all the approximations needed even for traditional solvers, what unique advantage does PINN offer? Can one drop any of those approximations entirely?”
Reply: In section IV, subsection B, the author emphasizes the potential strengths and limitations of the PINN framework:
However, the author is aware that, since validation against traditional numerical algorithms is typically expected by the established DSEs community, the absence of reliable benchmark solutions from the standard methods in certain regimes, or for extended sets of equations, may limit the ability to confirm the correctness of the PINN's predictions.
Minor Points:
Reply: m and alpha are now defined immediately after Equations (3) and (4) in section II.
Reply: To prevent confusion, the author defined the PINN output as B̃(x), reserving B(x) for the reference solution. The author also clarifies that the training is not supervised in the classical sense. As stated in section III, subsection C, the PINN learns by minimizing the residual between its own prediction and the integral operator applied to it, consistent with the standard PINN methodology.
Reply: In the updated figure 2 (figure 1 of the previous version), all curves for different values of alpha appear superposed as a single line because, under the rainbow truncation in Landau gauge, the function A = 1 is an exact solution of the DSE for the fermion propagator. This gauge/truncation choice removes the contribution of the vector part of the fermion self-energy, and thus A does not acquire any nontrivial momentum dependence. The identical curves reflect this known analytic behavior. This point is now clarified in the caption of figure 2 and throughout the text of section IV, subsection A.
4. Fixes to figure 4:
Reply: Figure 1 (figure 4 in the previous version) has been regenerated with the same axes, limits, and aspect ratio as figure 2 of the previous version, showing excellent agreement; the two plots are now included side-by-side.
Reply: These pages have been reviewed to detail the changes made to the model and loss, report new error metrics, and remove any language suggesting the PINN was ineffective. Finally, the author is grateful for the reviewer’s thoughtful comments, which have significantly improved the clarity, technical correctness, and relevance of the manuscript.