SciPost Submission Page
Foundation models for high-energy physics
by Anna Hallin
Submission summary
| Authors (as registered SciPost users): | Anna Hallin |
| Submission information | |
|---|---|
| Preprint Link: | https://arxiv.org/abs/2509.21434v1 (pdf) |
| Date submitted: | Sept. 29, 2025, 10:03 a.m. |
| Submitted by: | Anna Hallin |
| Submitted to: | SciPost Physics Proceedings |
| Proceedings issue: | The 2nd European AI for Fundamental Physics Conference (EuCAIFCon2025) |
| Ontological classification | |
|---|---|
| Academic field: | Physics |
| Specialties: | |
| Approaches: | Experimental, Computational, Phenomenological |
Abstract
The rise of foundation models -- large, pretrained machine learning models that can be finetuned to a variety of tasks -- has revolutionized the fields of natural language processing and computer vision. In high-energy physics, the question of whether these models can be implemented directly in physics research, or even built from scratch, tailored for particle physics data, has generated an increasing amount of attention. This review, which is the first on the topic of foundation models in high-energy physics, summarizes and discusses the research that has been published in the field so far.
Reports on this Submission
Report #1 by Tobias Golling (Referee 1) on 2025-10-29 (Invited Report)
Report
Requested changes
Please update this reference:
M. Leigh, S. Klein, F. Charton, T. Golling, L. Heinrich, M. Kagan, I. Ochoa, M. Osadchy, “Is Tokenization Needed for Masked Particle Modelling?,” 2025 Mach. Learn.: Sci. Technol. 6 025075 [2409.12589].
See https://iopscience.iop.org/article/10.1088/2632-2153/addb98
For ATLAS references, please list the ATLAS Collaboration as the author, as is done for CMS (see e.g. Refs. 18 and 20).
The selection in Refs. 17-20 is somewhat arbitrary. I do not have a good suggestion for how to fix this. The options are to add many more references to give a representative view, to focus on summary papers, or to remove them altogether, as they are not relevant in the end.
I suggest elaborating on the possible modalities in HEP, such as standard high-level reconstructed objects, particle-flow objects, tracks and clusters, raw data (hits…), various (latent) truth modalities, trigger objects, and language.
Section 2 makes the case that the focus will rest on the third bullet. This is well justified, as most of the work in the community concerns this bullet. This should be stated explicitly, and the order could be adjusted to reflect relevance to our community, i.e. start with bullet 3.
I find Section 3 problematic. I understand that the author is an expert on this particular model. However, as this document claims to be a review of the state of the art (see abstract), I advocate against using it as a platform to highlight the author's own work. Maybe this is just my personal preference, but I worry it sends the wrong message to the community. My suggestion is to drop this section. Some of the details could be moved to Section 4, but care should be taken to keep a healthy balance between the various methods and approaches.
A final (personal) request: it would be very useful for the community to arrange and compare the methods visually along the various axes mentioned in the text. I know that the information is already in the current text, but it would be nice to have this if it is not too much work. Such a sketch could become a reference in our community for showing the most relevant differences (regarding pre-training techniques, supervision, physics encoding, input features, downstream tasks, etc.), as was done to some extent in the talk.
Last sentence in the conclusion: I am wondering if you wish to reference EuCAIF WG1 directly as an opportunity here?
Recommendation
Ask for minor revision
