SciPost Submission Page

Performance in solving the Hermitian and pseudo-Hermitian Bethe-Salpeter equation with the Yambo code

by Petru Milev, Blanca Mellado-Pinto, Muralidhar Nalabothula, Ali Esquembre Kucukalic, Fernando Alvarruiz, Enrique Ramos, Alejandro Molina-Sanchez, Ludger Wirtz, Jose E. Roman, Davide Sangalli

This is not the latest submitted version.

Submission summary

Authors (as registered SciPost users): Petru Milev
Submission information
Preprint Link: scipost_202507_00069v1  (pdf)
Code repository: https://gitlab.com/lumen-code/lumen
Code version: Lumen Fork
Code license: GPL-2.0 license
Data repository: https://github.com/Petru-Milev/data_for_scipost_submission_1
Date submitted: July 25, 2025, 12:40 p.m.
Submitted by: Petru Milev
Submitted to: SciPost Physics Codebases
Ontological classification
Academic field: Physics
Specialties:
  • Condensed Matter Physics - Computational
Approach: Computational

Abstract

We analyze the performance of two strategies in solving the structured eigenvalue problem deriving from the Bethe-Salpeter equation (BSE) in condensed matter physics. The BSE matrix is constructed with the Yambo code, and the two strategies are implemented by interfacing Yambo with the ScaLAPACK and ELPA libraries for direct diagonalization, and with the SLEPc library for the iterative approach. We consider both the Hermitian (Tamm-Dancoff approximation) and pseudo-Hermitian forms, addressing dense matrices of three different sizes. A description of the implementation is also provided, with details for the pseudo-Hermitian case. Timing and memory utilization are analyzed on both CPU and GPU clusters. Our results demonstrate that it is now feasible to handle dense BSE matrices of the order of 10^5.

Current status:
Has been resubmitted

Reports on this Submission

Report #2 by Anonymous (Referee 2) on 2025-9-29 (Invited Report)

Strengths

1- Comparison of different solvers interfaced with the Yambo electronic structure code in terms of runtime, scalability and memory utilization
2- Detailed analysis of results
3- Good readability and writing style

Weaknesses

1- Suitability for journal questionable
2- Structure and conciseness could be improved
3- Details on used solvers are unclear

Report

The manuscript "Performance in solving the Hermitian and pseudo-Hermitian Bethe-Salpeter equation with the Yambo code" presents runtime experiments and results to compare different solvers that can be used to solve the "Bethe-Salpeter eigenvalue problem" arising in condensed matter physics. The electronic structure code "Yambo" is interfaced with solver libraries (Sca)LAPACK and ELPA for direct diagonalization and SLEPc for an iterative solver. The Yambo default solver "Haydock", directly computing the optical spectrum, is also considered. A lot of valuable data is gathered and put into context.

The manuscript does not present a "Codebase" per se and therefore does not seem to align with the publication criteria of the journal "SciPost Physics Codebases". "Yambo" would be an example of a codebase, as I understand it, but is not presented in its entirety here. Instead, only some new features, i.e. the ability to interface with other solver libraries, are evaluated. These additions are not (yet?) implemented in Yambo's production code, but only in the development fork "lumen", which further limits their relevance given the journal's scope.

While the manuscript is pleasantly readable, it would benefit from a clearer and more concise structure. If one wants to understand, say, one figure, the relevant information (algorithm, library, hardware, etc.) is scattered all over the manuscript.

In particular, perhaps due to the confusing structure of the manuscript, it is unclear exactly which algorithms are meant by "non-Hermitian Algorithms" and "pseudo-Hermitian Algorithms". For "non-Hermitian Algorithms", I assume a general eigensolver that completely ignores the matrix structure, i.e. LAPACK's "zgeev", but this should be stated explicitly. The "pseudo-Hermitian Algorithms" are even less clear. Section 2.1.1 implies that the (complex) eigenvalue problem is recast into a real skew-symmetric eigenvalue problem, which is then solved by a special skew-symmetric solver (see references 9 and 32). However, as far as I am aware, this type of solver is not available in standard (Sca)LAPACK, but only in ELPA. So what exactly was used in the (Sca)LAPACK case? Or was the eigenvalue problem interpreted as a complex generalized Hermitian problem? In that case, Section 2.1.1 is misleading. This should be clarified.
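To make the distinction between a structure-agnostic solver and a structure-exploiting one concrete, here is a minimal sketch. It is not taken from the manuscript or from Yambo. It assumes the standard BSE block form H = [[A, B], [-B*, -A*]] with A Hermitian and B complex symmetric, and a positive definite M = [[A, B], [B*, A*]]. The function names (`build_bse_blocks`, `solve_general`, `solve_structured`) and the random test matrices are purely illustrative. The general route calls `scipy.linalg.eigvals`, which for complex input uses a general nonsymmetric LAPACK driver. The structured route uses H = ΣM with Σ = diag(I, -I), so that Σx = (1/λ)Mx is a generalized Hermitian-definite problem that `scipy.linalg.eigh` can solve.

```python
import numpy as np
from scipy.linalg import eigh, eigvals


def build_bse_blocks(n, seed=0):
    """Hypothetical toy blocks: A Hermitian (dominant), B complex symmetric (small)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    A = X @ X.conj().T / n + 5.0 * np.eye(n)   # Hermitian, eigenvalues >= 5
    Y = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    B = 0.05 * (Y + Y.T)                       # complex symmetric coupling
    return A, B


def solve_general(A, B):
    """Ignore all structure: general complex eigensolver on the full 2n x 2n matrix."""
    H = np.block([[A, B], [-B.conj(), -A.conj()]])
    w = eigvals(H)
    return w[np.argsort(w.real)]


def solve_structured(A, B):
    """Use H = Sigma @ M: solve Sigma x = mu M x with M Hermitian positive definite."""
    n = A.shape[0]
    M = np.block([[A, B], [B.conj(), A.conj()]])
    Sigma = np.diag(np.concatenate([np.ones(n), -np.ones(n)])).astype(complex)
    mu = eigh(Sigma, M, eigvals_only=True)     # real eigenvalues, mu = 1 / lambda
    return np.sort(1.0 / mu)


if __name__ == "__main__":
    A, B = build_bse_blocks(6)
    print(solve_general(A, B).real)
    print(solve_structured(A, B))
```

Under these assumptions both routes return the same real spectrum in ±λ pairs, but only the second guarantees real eigenvalues by construction. Neither route is the real skew-symmetric reformulation discussed in Section 2.1.1; the sketch only illustrates why the choice of algorithm needs to be stated explicitly.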

The manuscript should be sent to a journal whose scope better fits its contents, and should improve its clarity.

Requested changes

Some further open questions and points to improve:
1- Footnotes should not be used as sources, but should be actual footnotes or incorporated into the text where it makes sense.
2- The number of computed eigenvalues (100) in the iterative scheme seems arbitrary and specific, but the conclusions drawn are very general. (Example: “As already observed, iterative solvers are orders faster than diagonalization-based approaches.”) The conclusions would be more convincing if the number of computed eigenvalues were varied as well, if they were supported by algorithmic arguments (rather than only observed empirically), or if a good reason were given for choosing this number specifically.
3- How was GPU memory measured? The Linux time command does not provide this functionality.
4- In the hardware overview, details on the node interconnect are missing. These are relevant for the experiments of Figure 3, which go beyond the single-node regime when using more than 32 cores. The impact of the interconnect should be addressed.

Recommendation

Accept in alternative Journal (see Report)

  • validity: good
  • significance: ok
  • originality: low
  • clarity: ok
  • formatting: excellent
  • grammar: good

Report #1 by Anonymous (Referee 1) on 2025-9-29 (Invited Report)

Disclosure of Generative AI use

The referee discloses that the following generative AI tools have been used in the preparation of this report:

OpenAI ChatGPT (GPT-5), used solely to check the English grammar of the report

Report

The present manuscript is clearly written, technically sound, and addresses a computational bottleneck in solving the Bethe-Salpeter equation. This is relevant for condensed matter physics and materials science. The benchmarking on CPU/GPU clusters, as well as the comparison between Hermitian, non-Hermitian, and pseudo-Hermitian formulations, is accurate. The work will be valuable for the community, especially Yambo users. Overall the manuscript is technically strong, with solid contributions.

I support the publication of this work, and would like the authors to reply to the following comments:

1) While the manuscript focuses on performance, it would benefit from a short comment on some scientific applications. For example: what system sizes or particular physical phenomena (e.g. excitons in specific 2D materials, Rydberg states, etc.) become accessible with these advances? Some readers may miss why a BSE matrix with size 10^5 matters physically, or what kind of excitonic spectra these calculations are enabling.

2) As far as I can see, the benchmarks are primarily performed using the BSE Hamiltonian of CrI3. Could the authors clarify to what extent the conclusions obtained from this material are expected to be general? In particular, are there material-dependent features (e.g. band dispersion, screening, etc.) that might affect the efficiency or scaling?

Recommendation

Publish (easily meets expectations and criteria for this Journal; among top 50%)

  • validity: -
  • significance: -
  • originality: -
  • clarity: -
  • formatting: -
  • grammar: -
