SciPost Submission Page
How to Unfold Top Decays
by Luigi Favaro, Roman Kogler, Alexander Paasch, Sofia Palacios Schweitzer, Tilman Plehn, Dennis Schwarz
Submission summary
| Submission information | |
|---|---|
| Authors (as registered SciPost users): | Luigi Favaro · Roman Kogler · Sofia Palacios Schweitzer · Tilman Plehn · Dennis Schwarz |
| Preprint Link: | scipost_202505_00008v2 (pdf) |
| Date accepted: | July 24, 2025 |
| Date submitted: | July 9, 2025, 2:50 p.m. |
| Submitted by: | Sofia Palacios Schweitzer |
| Submitted to: | SciPost Physics |

| Ontological classification | |
|---|---|
| Academic field: | Physics |
| Specialties: | |
| Approaches: | Experimental, Computational, Phenomenological |
Abstract
Using unfolded top-quark decay data, we can measure the top quark mass and search for unexpected kinematic effects. We present a new generative unfolding method for these two tasks and show how both benefit from unbinned, high-dimensional unfolding. Unlike weight-based or iterative generative methods, our method includes a targeted unbiasing with respect to the training data. This gives it significant advantages over standard iterative methods in terms of applicability, flexibility, and accuracy.
Author indications on fulfilling journal expectations
- Provide a novel and synergetic link between different research areas.
- Open a new pathway in an existing or a new research direction, with clear potential for multi-pronged follow-up work
- Detail a groundbreaking theoretical/experimental/computational discovery
- Present a breakthrough on a previously-identified and long-standing research stumbling block
Author comments upon resubmission
Dear Editor-in-Charge,
We strongly believe that scientific progress also follows from discussion. Peer review is part of this process, in which valuable constructive feedback can improve scientific work or clarify the scope of future directions. However, there is a difference between authors and referees: aspects a referee would have liked to follow up on, had they been an author, are not the duty of the actual authors.
Let us make a few specific comments. First, we are developing a method to improve a published CMS analysis, which has been reviewed by the collaboration and by a journal. A critical review and update of this benchmark is not our job, because we have to assume that a published CMS result defines the state of the art. If people feel that the internal CMS review process is insufficient, please contact the CMS management. Technically, we do not see how applying the OmniFold reweighting to a range of top mass values is "standard"; please be specific and provide us with a suitable reference. It is also not our goal to develop a new OmniFold strategy to deal with excessively large weights. As a general comment: of course we would have liked to show how our analysis strategy improves the published CMS analysis in all aspects, but CMS policies explicitly keep us from doing this.
Finally, we cannot but condemn the entirely unprofessional decision of one of the referees not to follow up on the discussion, and we respectfully ask SciPost to remove this referee from their referee database, so that other authors will not have the same sobering experience. Sincerely, The Authors
Answer to Report 2
Dear Referee,
Thank you for taking the time to review our manuscript a second time. Our remarks/replies are threaded into your report. Sincerely, The Authors
"(1) The reply does not answer my question, so I will go into more detail to explain my concern. - To address the first part of your reply: The whole point of adding the M_jjj information to the batch is to strengthen the impact of m_d. Equation 22 specifically states this fact. - The primary message in your paper is that the use of CFM unbinned unfolding can dramatically reduce the leading systematic in the measurement, but only if a measure of the top mass in data is added to the training. - Backgrounds will then impact on the training if they are incorrectly subtracted from the data. This is not equivalent to noise (random fluctuations). - It is crucial to show that the removal of one systematic is not replaced by a different one of equal size. As written currently, the paper proposes a method but does not conclusively show that the new systematics associated with the method are fully controlled. - The way to estimate this is as follows: create an Asimov dataset containing background events at a given normalisation. Apply background subtraction but with a systematically-shifted normalisation, leaving a residual background in the Asimov dataset. Then you can unfold this data and re-extract the top mass to determine the systematic. - You could also test the background subtraction procedure to see if there is a residual bias there, i.e. add a sample of background in and subtract a statistically different sample that has the same normalisation. This is not as critical, but interesting."
The effect of the background was studied in the CMS analysis. The uncertainties on the rates were conservatively estimated at 19% for W+jets production, 21% for single top quark production, and 100% for other relevant SM processes. The normalization uncertainties in the different backgrounds introduce a shape uncertainty when the normalization of single processes is changed. In the cited CMS paper, the overall background uncertainty was estimated to be only 0.01 GeV in the extraction of the top quark mass and thus negligible compared to other uncertainties. With this in mind, we do not consider the background estimation relevant for this paper, but rather leave the details to the experiments carrying out the full measurement. We also believe that there is a misunderstanding in the formulation of the question. The background samples never impact the training, which happens only on signal (ttbar) simulation. Background events only enter the analysis when the actual unfolding is done, i.e. when the trained networks are used to solve the inversion problem. After the unfolding, the background would be probabilistically subtracted from the unfolded data. Small shifts in the value of m_d, as would be introduced by background contributions to the M_jjj distribution, will have a negligible impact on the result. Having said that, we would like to point out that the actual amount of background depends on the experimental analysis. There are many handles for suppressing background if one is willing to pay a price in signal efficiency. The balance in the resulting size of uncertainties is a question of optimization, which the experiments have to perform for the actual data analysis and which cannot be predicted in our study. Finally, if the background is not well modelled by our simulation, and the subtraction is hence incorrect, bin-wise subtraction as well as continuous subtraction methods would suffer in the same way.
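For illustration, the size of such a residual-background pull can be estimated along the lines the report suggests. The following is a minimal numpy sketch with purely illustrative shapes, yields, and normalization shifts; none of the numbers are taken from our paper or from the CMS analysis:

```python
import numpy as np

rng = np.random.default_rng(1)

# Purely illustrative stand-ins for the M_jjj shapes; none of these numbers
# are taken from our paper or from the CMS analysis.
signal = rng.normal(172.5, 15.0, 100_000)           # peaked signal proxy
background = 100.0 + rng.exponential(80.0, 10_000)  # smooth continuum proxy

bins = np.linspace(100.0, 300.0, 51)
centers = 0.5 * (bins[:-1] + bins[1:])
h_data, _ = np.histogram(np.concatenate([signal, background]), bins=bins)
h_bkg, _ = np.histogram(background, bins=bins)

def subtracted_mean(norm_shift):
    """Bin-wise subtraction with the background template scaled by (1 + norm_shift)."""
    h_sub = np.clip(h_data - (1.0 + norm_shift) * h_bkg, 0.0, None)
    return np.average(centers, weights=h_sub)

# A mis-estimated normalization leaves a residual background in the
# subtracted spectrum; its pull on the peak observable is read off directly.
for shift in (0.0, +0.2, -0.2):
    print(f"norm shift {shift:+.1f}: subtracted mean = {subtracted_mean(shift):.2f} GeV")
```

In a realistic version of this test, the subtracted Asimov data would additionally be unfolded and the top mass re-extracted; the toy only shows where the residual pull enters.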
"(5) Thanks for this confirmation that only paired events are used. I looked at the additions to the text and see that the label ‘paired’ is added to equation 10. I couldn’t find another mention of this. I think a wider explanation is needed here, making the points that: (i) the training is done on paired events and presumably (ii) that you restrict the “data” in the unfolding also to events that satisfy truth&reco [otherwise there would be a bias for events not seen in the training]. The point here is that the efficiencies and fiducial factors are not corrected by this method and additional corrections would be needed. It would be good to add a reference to any literature that shows how that is done."
In all unfolding techniques, one needs to correct for the specific efficiencies of the truth and reco selections. The optimal estimation procedure and the size of these efficiencies depend strongly on the details of the event selection and reconstruction. We are aware that non-paired events need to be addressed in a realistic measurement, but we leave the details of the implementation to the experiments. This issue is closely related to the treatment of backgrounds: non-paired events can be treated as background if they are selected at the reconstruction level but are not part of the measurement’s fiducial phase space at the particle level. On the other hand, events that were generated in the fiducial phase space but were not reconstructed because of the detector’s acceptance or an inefficiency need to be accounted for by an efficiency correction. There are different ways of dealing with these effects. We added a clarification in the text and also give references for efficiency effects; one of these studies corrects for efficiency effects through a classifier. There, we also write explicitly that only “paired events” are used in our study.
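To make the last point concrete, a classifier-based efficiency correction can be sketched as follows. This is a toy illustration under our own assumptions (a single gen-level variable and an invented turn-on curve), not the implementation of the referenced study:

```python
import numpy as np
from sklearn.ensemble import HistGradientBoostingClassifier

rng = np.random.default_rng(2)

# Toy gen-level variable with an invented turn-on-like reconstruction
# efficiency; variable, curve, and sample size are illustrative only.
x_gen = rng.exponential(100.0, 200_000)
eps_true = 1.0 / (1.0 + np.exp(-(x_gen - 60.0) / 20.0))
is_paired = rng.random(x_gen.size) < eps_true        # passes the reco selection

# The classifier output estimates eps(x_gen) = P(reconstructed | x_gen).
clf = HistGradientBoostingClassifier().fit(x_gen[:, None], is_paired)
eps_hat = clf.predict_proba(x_gen[:, None])[:, 1]

# Weighting paired events by 1/eps recovers the fiducial gen-level spectrum.
w = 1.0 / np.clip(eps_hat, 1e-3, None)
print("fiducial mean :", x_gen.mean())
print("paired mean   :", x_gen[is_paired].mean())
print("corrected mean:", np.average(x_gen[is_paired], weights=w[is_paired]))
```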
"(7) The addition of Omnifold comparisons (at the request of the other referee) raises the question as to whether this was optimised. For example, I wonder what would happen if the dependence on m_s was also weakened in Omnifold by using the three datasets (this is a standard method in ML training). I welcome the addition of this material, but I think it needs to be more rigorously examined to make sure that claims of superior methodology are fully backed up."
The OmniFold results shown in Fig. 17 strictly follow the procedure presented in the original formulation. We consider the optimisation of the training procedure adequate, since the first reweighting at detector level correctly shifts the M_jjj distribution within statistical errors. The pushed particle-level weights from the second OmniFold step would again bias the detector-level M_jjj distribution, so the iteration never reaches convergence. While we do not exclude the possibility that OmniFold can be extended to the unfolding problem studied in this manuscript, we do not see a clear path towards that extension. The classifiers trained in steps one and two compare directly to the data, so they will not benefit from a larger set of simulated events, regardless of the additional conditioning. This is a future development direction for OmniFold rather than an optimisation issue, and we leave it to the OmniFold authors.
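For reference, the two-step structure we refer to can be written down in heavily simplified form. The sketch below uses Gaussian toys and logistic-regression classifiers in place of neural networks; the helper function reweight and all numbers are our own illustration of the published OmniFold iteration, not our actual setup:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def reweight(source, w_src, target, w_tgt):
    """Likelihood-ratio weights for `source` from a source-vs-target classifier."""
    X = np.concatenate([source, target])[:, None]
    y = np.concatenate([np.zeros(len(source)), np.ones(len(target))])
    clf = LogisticRegression().fit(X, y, sample_weight=np.concatenate([w_src, w_tgt]))
    p = clf.predict_proba(source[:, None])[:, 1]
    return p / (1.0 - p)

rng = np.random.default_rng(3)
gen = rng.normal(172.5, 10.0, 50_000)        # paired simulation, gen level
reco = gen + rng.normal(0.0, 8.0, gen.size)  # toy detector smearing
data = rng.normal(170.0, 10.0, 50_000) + rng.normal(0.0, 8.0, 50_000)

w_gen = np.ones(gen.size)
for _ in range(3):
    # Step 1: reweight the (pushed) simulation to the data at detector level.
    w_pull = w_gen * reweight(reco, w_gen, data, np.ones(data.size))
    # Step 2: turn the pulled per-event weights into a function of gen only.
    w_gen = w_gen * reweight(gen, w_gen, gen, w_pull)

print("unfolded gen-level mean:", np.average(gen, weights=w_gen))  # toy truth: 170
```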
List of changes
- We have added the following discussion of the backgrounds at the end of Section 2.1:
“The CMS analysis [34] shows that continuum backgrounds, like $W$+jets production, can be subtracted bin-wise to the level where they are no longer relevant in the analysis. The normalization uncertainties in the different backgrounds introduce a shape uncertainty when changing the normalization of single processes. While the background normalizations vary between 20% and 100% in the CMS analysis, the overall background uncertainty was estimated to be only 0.01 GeV in the extraction of the top quark mass and is thus negligible compared to other uncertainties. The method of bin-wise background subtraction can be generalized to the unbinned case with the help of a classifier [52], which suggests that background uncertainties will remain small compared to other systematic uncertainties in this measurement. Therefore, we neglect these in our study and consider signal events only.” (A toy sketch of such a classifier-based subtraction is given after this list.)
- We have added the following discussion at the end of Section 2.1:
“We only consider paired events in our signal, i.e. events that passed both reco- and gen-level cuts. Non-paired events can be treated as background if they are selected at the reco level but are not part of the measurement’s fiducial phase space at the gen level. On the other hand, events that were generated in the fiducial phase space at gen level but were not reconstructed because of the detector’s acceptance or an inefficiency need to be accounted for by an efficiency correction. This can be done through weights, as for example in the Iterative Bayesian unfolding method [42–45] implemented in RooUnfold [46] and in TUnfold [47], and successfully applied in several jet substructure analyses at the LHC, see for example Refs. [34,48–50]. Another way to include efficiency and acceptance effects is through a classifier [51], but we leave the details of such a study to future work, as these are closely related to the actual implementation of the data analysis.”
- We have added the following discussion at the end of Appendix A:
"Although the standard OmniFold approach fails in our unfolding tasks, it does not mean that similar adaptions to the algorithm could not lead to unbiased results. However, we leave a concrete investigation of the matter to the OmniFold authors."
Published as SciPost Phys. Core 8, 053 (2025)
