SciPost Submission Page
Mass Agnostic Jet Taggers
by Layne Bradshaw, Rashmish K. Mishra, Andrea Mitridate, Bryan Ostdiek
This Submission thread is now published as SciPost Phys. 8, 011 (2020)
Submission summary
Authors (as registered SciPost users): | Layne Bradshaw · Bryan Ostdiek |
Submission information | |
---|---|
Preprint Link: | scipost_201909_00004v5 (pdf) |
Date accepted: | 2019-12-20 |
Date submitted: | 2019-12-13 01:00 |
Submitted by: | Bradshaw, Layne |
Submitted to: | SciPost Physics |
Ontological classification | |
---|---|
Academic field: | Physics |
Specialties: | |
Approaches: | Theoretical, Experimental |
Abstract
Searching for new physics in large data sets requires balancing two competing effects: signal identification and background distortion. In this work, we perform a systematic study of both single-variable and multivariate jet tagging methods that aim for this balance. The methods preserve the shape of the background distribution by augmenting either the training procedure or the data itself. We consider multiple quantitative metrics to compare the methods for tagging 2-, 3-, or 4-prong jets from the QCD background. This is the first study to show that the data augmentation techniques of Planing and PCA-based scaling deliver performance similar to that of the augmented training techniques of Adversarial NN and uBoost, but are both easier to implement and computationally cheaper.
Author comments upon resubmission
List of changes
-We shortened the abstract as Referee 2 suggested.
-In response to the comments from Referees 1 and 3 on how the signal and background events were generated, we added details to Sec. 2 about how the 2- and 3-prong signals were generated, and clarified that the background events were generated at leading order in the QCD coupling.
-In response to Referee 2’s concern about the distinction between multivariate and machine learning methods, we acknowledge the distinction and make clear that we use the two terms interchangeably in the discussion that follows. This change is made at the beginning of Sec. 3.
-Referee 1 asked that we define the variable beta in Eq. 1, and we have clarified that beta is a real number in Sec. 3.1. To remove any possible confusion with the variable beta used in the uBoost algorithm, we have renamed that hyperparameter as beta_{u} to make clear that this is a different beta than the one used in defining the N-subjettiness variables.
-Referee 3 asked that we clarify that the X basis introduced in Eq. 4 is used by both multivariate classifiers, and we have added a sentence immediately following Eq. 4 making this clarification.
-Referee 1 asked that we explain the colored lines appearing in Fig. 4 and elsewhere. In the captions of Figs. 4, 5, and 9, we explain that the colored lines correspond to different values of the signal efficiency.
-Referee 3 requested that we add or explain the absence of Boosted Decision Tree performance in Sec. 3.2, and that we explain whether there are any penalties in terms of application time for the data augmentation techniques. In Sec. 3.2, prior to explaining any of the data augmentation techniques, we explain that we are only showing results for the Neural Networks when introducing these techniques, but that they can be applied to either classifier. We also explain that these methods are fast, and that we expect the application-time cost to be minimal.
-Referee 1 had concerns regarding our definition of the background rejection over the whole mass range we consider. At the beginning of Sec. 4, we address these concerns by explaining why we expect this choice to give qualitatively the same results as defining the background rejection over a narrower window centered on the signal: all of the taggers considered in this work are tasked with keeping the background rejection constant over the whole mass range.
-Referee 2 asked us to explain what the takeaway from Table 2 is meant to be. We clarified this in the caption.
-Referee 3 pointed out that our statement regarding the prongedness of the QCD background in Sec. 4.1 was imprecise. We have made this correction.
-Referee 1 was concerned that the different choices for the x-axes in Figs. 12, 14, and 15 incorrectly give the impression that the 4-prong events are sculpted more than the 2- or 3-prong signals. We changed the x-axes of these figures so that the background rejection for all of the signals is in the range from 10^3 to 10^0. The trend is much easier to see now.
-Referees 2 and 3 asked that we separate our outlook from the conclusion to make the latter both shorter and easier to understand. We added a new section, Sec. 5, where we discuss the outlook and future work. We also shortened our conclusion (now in Sec. 6) to make the main takeaways of our work clearer.
-Regarding Referee 3’s concerns about whether the data augmentation techniques have any impact on the statistical power of the experimental data, we explain at the beginning of Sec. 5 that we don’t expect this to be the case, since the PCA-based approach involves only a linear transformation of the data, and Planing only requires that the training data (likely from Monte Carlo) be reweighted.
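As an illustration of the reweighting mentioned in the last item, the following is a minimal sketch of one common way to compute Planing weights: histogram the jet mass of one class of training events and weight each event by the inverse of its bin count, so that the weighted mass distribution is flat. The function name `planing_weights`, the uniform binning, and the default of 50 bins are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def planing_weights(mass, n_bins=50):
    """Per-event weights that flatten a jet-mass histogram (hypothetical helper).

    Apply separately to the signal and background training samples.
    """
    mass = np.asarray(mass, dtype=float)
    # Uniform bin edges spanning the observed mass range (an assumption).
    edges = np.linspace(mass.min(), mass.max(), n_bins + 1)
    # Bin index of each event; clip so the maximum falls in the last bin.
    idx = np.clip(np.digitize(mass, edges) - 1, 0, n_bins - 1)
    # Count events per bin with the same indices used for the lookup,
    # so every event's own bin has at least one entry.
    counts = np.bincount(idx, minlength=n_bins)
    weights = 1.0 / counts[idx]
    # Normalize so the mean weight is 1, keeping the overall scale unchanged.
    return weights * len(mass) / weights.sum()
```

Such weights would typically be passed as per-sample weights in the training loss. Only the training sample is reweighted, and the events being classified are left untouched.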
Published as SciPost Phys. 8, 011 (2020)
Reports on this Submission
Report 2 by Tilman Plehn on 2019-12-17 (Invited Report)
Report
Thank you for taking into account my comments, very nice work!
Anonymous Report 1 on 2019-12-17 (Invited Report)
- Cite as: Anonymous, Report on arXiv:scipost_201909_00004v5, delivered 2019-12-17, doi: 10.21468/SciPost.Report.1399
Report
Thanks to the authors for taking on board most of the points I made, and I don't regard the remainder as problems. I do think the statistical impact of reweighting will need to be considered in practice -- there is increasing awareness that MC samples are also expensive, and a danger for future experimental runs that their statistics even without reweighting will be a limiting factor. But this is an additional point of context, and on the raw science of comparing decorrelation methods this paper is a valuable contribution.