
SciPost Submission Page

Full Event Particle-Level Unfolding with Variable-Length Latent Variational Diffusion

by Alexander Shmakov, Kevin Greif, Michael James Fenton, Aishik Ghosh, Pierre Baldi, Daniel Whiteson

Submission summary

Authors (as registered SciPost users): Michael James Fenton
Submission information
Preprint Link:  (pdf)
Date submitted: 2024-04-24 20:16
Submitted by: Fenton, Michael James
Submitted to: SciPost Physics
Ontological classification
Academic field: Physics
  • High-Energy Physics - Experiment
  • High-Energy Physics - Phenomenology
Approaches: Experimental, Computational


The measurements performed by particle physics experiments must account for the imperfect response of the detectors used to observe the interactions. One approach, unfolding, statistically adjusts the experimental data for detector effects. Recently, generative machine learning models have shown promise for performing unbinned unfolding in a high number of dimensions. However, all current generative approaches are limited to unfolding a fixed set of observables, making them unable to perform full-event unfolding in the variable dimensional environment of collider data. A novel modification to the variational latent diffusion model (VLD) approach to generative unfolding is presented, which allows for unfolding of high- and variable-dimensional feature spaces. The performance of this method is evaluated in the context of semi-leptonic top quark pair production at the Large Hadron Collider.

Author indications on fulfilling journal expectations

  • Provide a novel and synergetic link between different research areas.
  • Open a new pathway in an existing or a new research direction, with clear potential for multi-pronged follow-up work
  • Detail a groundbreaking theoretical/experimental/computational discovery
  • Present a breakthrough on a previously-identified and long-standing research stumbling block
Current status:
In refereeing

Reports on this Submission

Anonymous Report 1 on 2024-6-7 (Invited Report)


Strengths

- The method and architecture are well described, and many details are given
- It fills a gap in the literature: it is the first method to unfold full-event (variable-dimensional) collider data


Weaknesses

- One of the main results of the paper, that the unfolding performance is independent of the training data distribution, could be explained and investigated in more detail


Report

The manuscript describes a novel machine learning architecture to unfold full particle physics events, i.e. variable-dimensional collider data. Unfolding denotes the inversion of detector effects, allowing for long-term use of experimental data without the need for expensive detector simulation. The manuscript is very well written and provides many details on the architecture. In particular, I like the illustrative figure 1 and the many subsections in section 3 that describe all the individual elements of the model.

My biggest concern is about the observations in section 4.3 in combination with the statement in section 6: "This lack of prior dependence strongly motivates the use of VLD for unfolding." It is true that, for the dataset considered here, the model performs well on a distribution different from the one seen in training. However, it is not clear that this generalizes beyond this example. I understand that the authors cannot look at many processes within the scope of this manuscript, but I would at least like to see some explanation or investigation of why such behavior would be expected. If the authors want to keep the sentence in the conclusion, they need to provide more explanation, tests, or examples to support the claim.
Apart from this, I have a few minor proposals that could enhance the manuscript; see the list below.

I therefore think that the manuscript should be published in SciPost Physics after the concerns have been addressed.

Requested changes

- One of the strengths of generative unfolding, event-by-event uncertainties (since a given detector-level input can be unfolded several times), could be mentioned in the introduction, since this is an additional reason to use this method compared to discriminative methods.
- In section 3.1, is O_P_0 treated differently by the position-equivariant transformer with respect to the other O_P_i? Please explain.
- In section 3.3, why is y_0 needed? Please explain.
- In section 3.4, how is it ensured that the ordering is constant over t? Or is that not a problem to be concerned about?
- In section 4.1, have both coordinate representations P^cart and P^Polar been used simultaneously (concatenated)?
- In sections 4.2 and 4.3, to better visualize the correlations between the observables and how well these are learned by the model, I suggest adding corner plots (for example in an appendix). Maybe this also explains the performance on down-stream observables a bit more.
- In sections 4.2 and 4.3, in addition to the metrics shown in the tables, I think it would be nice to see how well a neural classifier (see discussion in 2305.16774) would be able to distinguish unfolded from true events.
- In figures 5c and 7b (c), I suggest zooming in on the bottom panel a bit.
- In the tables, I suggest adding a 'truth vs truth' test to get a better feeling for the natural spread of the metrics (i.e. is a distance of 0.04 a lot?).
- The authors refer frequently to reference 18, which is fine. However, it would be nice if the definitions of the metrics used in the tables were replicated here as well, rather than being kept only in Appendix C of Ref. 18.
- In table 3, the errors of the VAE and diffusion seem to add perfectly to the unfolding error. Is that by construction, or a non-trivial cross-check of how the metrics are evaluated?
- Please make your code and training data available via Git / Zenodo / other platforms.
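The classifier-based check suggested above (cf. 2305.16774) can be sketched as follows. This is an illustrative toy two-sample test with a simple logistic-regression classifier, not the authors' setup: one trains a classifier to distinguish unfolded from true events, and an AUC near 0.5 indicates the two samples are statistically indistinguishable. All names, sample sizes, and hyperparameters here are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def classifier_auc(x_true, x_gen, epochs=200, lr=0.1):
    """Train a logistic-regression two-sample classifier, return its AUC.

    AUC ~ 0.5: the classifier cannot separate the samples, i.e. the
    generated (unfolded) events look statistically compatible with truth.
    """
    x = np.vstack([x_true, x_gen])
    y = np.concatenate([np.ones(len(x_true)), np.zeros(len(x_gen))])
    # Standardize features so a single learning rate works for all of them.
    x = (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-8)
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # sigmoid predictions
        g = p - y                               # gradient of log loss
        w -= lr * x.T @ g / len(y)
        b -= lr * g.mean()
    scores = x @ w + b
    # AUC via the Mann-Whitney rank statistic (no ties expected for floats).
    ranks = scores.argsort().argsort() + 1
    n1 = y.sum()
    n0 = len(y) - n1
    return (ranks[y == 1].sum() - n1 * (n1 + 1) / 2) / (n1 * n0)

# Toy check: "unfolded" drawn from the same distribution as "truth".
truth = rng.normal(0.0, 1.0, size=(4000, 3))
unfolded = rng.normal(0.0, 1.0, size=(4000, 3))
auc_same = classifier_auc(truth, unfolded)   # expected to be close to 0.5

# A shifted "unfolded" sample is separable, so the AUC rises well above 0.5.
shifted = rng.normal(0.5, 1.0, size=(4000, 3))
auc_diff = classifier_auc(truth, shifted)
```

In practice one would use a held-out test split and a more expressive classifier, but the rank-based AUC readout is scale-invariant, so even a roughly trained linear model already signals whether the unfolded sample deviates from truth.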


Ask for minor revision

  • validity: high
  • significance: top
  • originality: top
  • clarity: top
  • formatting: perfect
  • grammar: perfect
