SciPost Submission Page
Large Language Models -- the Future of Fundamental Physics?
by Caroline Heneka, Florian Nieser, Ayodele Ore, Tilman Plehn, Daniel Schiller
Submission summary
| Authors (as registered SciPost users): | Ayodele Ore · Tilman Plehn · Daniel Schiller |
|---|---|

| Submission information | |
|---|---|
| Preprint Link: | scipost_202507_00080v2 (pdf) |
| Code repository: | https://github.com/heidelberg-hepml/L3M |
| Date submitted: | Dec. 12, 2025, 12:17 p.m. |
| Submitted by: | Daniel Schiller |
| Submitted to: | SciPost Physics |

| Ontological classification | |
|---|---|
| Academic field: | Physics |
| Specialties: | |
| Approach: | Computational |
Abstract
For many fundamental physics applications, transformers, as the state of the art in learning complex correlations, benefit from pretraining on quasi-out-of-domain data. The obvious question is whether we can exploit Large Language Models, requiring proper out-of-domain transfer learning. We show how the Qwen2.5 LLM can be used to analyze and generate SKA data, specifically 3D maps of the cosmological large-scale structure for a large part of the observable Universe. We combine the LLM with connector networks and show, for cosmological parameter regression and lightcone generation, that this Lightcone LLM (L3M) with Qwen2.5 weights outperforms standard initialization and compares favorably with dedicated networks of matching size.
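As a rough illustration of the setup described in the abstract (a pretrained LLM backbone flanked by connector networks that translate between lightcone data and the token embedding space), the sketch below shows one way such a regressor could be wired up in PyTorch. It is not the authors' L3M implementation: the Hugging Face checkpoint name, the patch dimension, the MLP connectors, the mean pooling, and the number of cosmological parameters are all illustrative assumptions; the actual code lives at https://github.com/heidelberg-hepml/L3M.

```python
# Minimal sketch (not the authors' L3M code) of wrapping a pretrained Qwen2.5
# backbone with small "connector" networks: an encoder that maps lightcone
# patches into the LLM embedding space, and a head that regresses cosmological
# parameters from the pooled hidden states. All sizes are illustrative.
import torch
import torch.nn as nn
from transformers import AutoModel


class LightconeRegressor(nn.Module):
    def __init__(self, llm_name="Qwen/Qwen2.5-0.5B", patch_dim=512, n_params=6):
        super().__init__()
        # Pretrained LLM backbone (hidden states only, no language-model head).
        self.backbone = AutoModel.from_pretrained(llm_name)
        hidden = self.backbone.config.hidden_size
        # Input connector: lightcone voxel patches -> LLM token embeddings.
        self.encoder = nn.Sequential(
            nn.Linear(patch_dim, hidden), nn.GELU(), nn.Linear(hidden, hidden)
        )
        # Output connector: pooled hidden state -> cosmological parameters.
        self.head = nn.Sequential(
            nn.Linear(hidden, hidden), nn.GELU(), nn.Linear(hidden, n_params)
        )

    def forward(self, patches):            # patches: (batch, seq_len, patch_dim)
        tokens = self.encoder(patches)     # embed physics data as a token sequence
        out = self.backbone(inputs_embeds=tokens).last_hidden_state
        return self.head(out.mean(dim=1))  # mean-pool over the sequence and regress
```

Under the same assumptions, the "standard initialization" baseline mentioned in the abstract would correspond to building the backbone with random weights, e.g. via `AutoModel.from_config(AutoConfig.from_pretrained(llm_name))`, instead of loading the pretrained Qwen2.5 checkpoint.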
Author indications on fulfilling journal expectations
- Provide a novel and synergetic link between different research areas.
- Open a new pathway in an existing or a new research direction, with clear potential for multi-pronged follow-up work.
- Detail a groundbreaking theoretical/experimental/computational discovery.
- Present a breakthrough on a previously-identified and long-standing research stumbling block.
Author comments upon resubmission
List of changes
Current status:
Reports on this Submission
Report
Recommendation
Publish (meets expectations and criteria for this Journal)
