SciPost Submission Page
IAFormer: Interaction-Aware Transformer network for collider data analysis
by Waleed Esmail, Ahmed Hammad, Mihoko Nojiri
Submission summary
| Authors (as registered SciPost users): | Waleed Esmail |
|---|---|

| Submission information | |
|---|---|
| Preprint Link: | scipost_202507_00056v2 (pdf) |
| Code repository: | https://github.com/wesmail/IAFormer |
| Date submitted: | Dec. 16, 2025, 2:53 p.m. |
| Submitted by: | Waleed Esmail |
| Submitted to: | SciPost Physics |

| Ontological classification | |
|---|---|
| Academic field: | Physics |
| Specialties: | |
| Approaches: | Computational, Phenomenological |
Abstract
In this paper, we introduce IAFormer, a novel Transformer-based architecture that efficiently integrates pairwise particle interactions through a dynamic sparse attention mechanism. IAFormer contains two new mechanisms. First, the attention matrix depends on predefined boost-invariant pairwise quantities, which reduces the number of network parameters significantly compared with the original Particle Transformer models. Second, IAFormer incorporates a sparse attention mechanism based on "differential attention", so that it can dynamically prioritize relevant particle tokens while reducing the computational overhead associated with less informative ones. This approach significantly lowers the model complexity without compromising performance. Despite being more than an order of magnitude more computationally efficient than the Particle Transformer network, IAFormer achieves state-of-the-art performance in classification tasks on the top-tagging and quark-gluon datasets. Furthermore, we employ AI interpretability techniques to verify that the model captures physically meaningful information layer by layer through its sparse attention mechanism, building a network output that is resistant to statistical fluctuations. IAFormer highlights the value of sparse attention in Transformer-based analyses for reducing network size while improving performance.
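To make the two mechanisms described in the abstract concrete, the sketch below combines a differential attention head (the difference of two softmax maps, which suppresses common "noise" attention and yields an effectively sparse pattern) with an additive pairwise interaction bias on the attention logits. This is an illustrative NumPy sketch, not the authors' implementation: the function names, the fixed scalar `lam` (learnable in differential-attention models), and the random inputs are all assumptions for demonstration.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def differential_attention(Q1, K1, Q2, K2, V, U, lam=0.5):
    """Illustrative differential attention with a pairwise interaction bias.

    Q1, K1, Q2, K2 : (n, d) query/key projections for the two softmax maps
    V              : (n, d) value projections
    U              : (n, n) pairwise bias (e.g. boost-invariant pairwise
                     quantities such as Delta R or invariant mass, precomputed)
    lam            : mixing scalar (learnable in the actual architecture;
                     fixed here for illustration)
    """
    d = Q1.shape[-1]
    # both attention maps share the same physics-motivated pairwise bias U
    A1 = softmax(Q1 @ K1.T / np.sqrt(d) + U)
    A2 = softmax(Q2 @ K2.T / np.sqrt(d) + U)
    # subtracting the second map cancels attention weight that both maps
    # assign to uninformative tokens, giving an effectively sparse pattern
    return (A1 - lam * A2) @ V

rng = np.random.default_rng(0)
n, d = 6, 4  # 6 particle tokens, 4 features per token
Q1, K1, Q2, K2, V = (rng.normal(size=(n, d)) for _ in range(5))
U = rng.normal(size=(n, n))
out = differential_attention(Q1, K1, Q2, K2, V, U)
print(out.shape)  # one output vector per particle token
```

Because `U` is built from fixed pairwise kinematic quantities rather than learned per head, the attention logits need fewer trainable parameters than a fully learned pairwise embedding, which is the source of the parameter reduction claimed above.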
Author indications on fulfilling journal expectations
- Provide a novel and synergetic link between different research areas.
- Open a new pathway in an existing or a new research direction, with clear potential for multi-pronged follow-up work
- Detail a groundbreaking theoretical/experimental/computational discovery
- Present a breakthrough on a previously-identified and long-standing research stumbling block
Author comments upon resubmission
We have made the appropriate changes based on the referee's comments. We have addressed all the points raised by the referee and believe that our manuscript is now ready for publication.
With best regards,
The authors
