SciPost Submission Page
CapsNets Continuing the Convolutional Quest
by Sascha Diefenbacher, Hermann Frost, Gregor Kasieczka, Tilman Plehn, Jennifer M. Thompson
Submission summary
| Authors (as registered SciPost users): | Sascha Diefenbacher · Tilman Plehn · Jennifer Thompson |
| Submission information | |
|---|---|
| Preprint Link: | https://arxiv.org/abs/1906.11265v3 (pdf) |
| Date accepted: | Dec. 2, 2019 |
| Date submitted: | Nov. 4, 2019, 1 a.m. |
| Submitted by: | Sascha Diefenbacher |
| Submitted to: | SciPost Physics |
| Ontological classification | |
|---|---|
| Academic field: | Physics |
| Specialties: | |
| Approaches: | Theoretical, Experimental |
Abstract
Capsule networks are ideal tools to combine event-level and subjet information at the LHC. After benchmarking our capsule network against standard convolutional networks, we show how multi-class capsules extract a resonance decaying to top quarks from both the QCD di-jet and the top continuum backgrounds. We then show how its results can be easily interpreted. Finally, we use associated top-Higgs production to demonstrate that capsule networks can work on overlaying images to go beyond calorimeter information.
Author comments upon resubmission
List of changes
1 - Add discussion of differences between original capsule implementation and the current method. - Both implementations are, by design, identical. To clarify this we added: ``Analogous to the original capsule paper, we transition between convolutional and capsule part by re-shaping...'' in section 4 (an illustrative sketch of this re-shaping step is given after this list).
2 - Add information about how many routings were used. - We used 3 routings, as was shown to be optimal in other studies (an illustrative sketch of the routing loop is given after this list). We have rephrased a sentence to reflect this: ``We repeated this for a chosen number of routings, where three iterations have in other studies given the best results'' now reads ``We repeated this for 3 routings, which has been shown in other studies to give the best results''.
3 - More information about the preprocessing. The authors mention that CapsNets need less preprocessing, and it sounds like only scaling the images so the most intense pixel has a value of 1.0 was done. Is this the same for the benchmarks of the Rutgers DeepTop Taggers as done here? - We have added: ``In contrast to the minimal pre-processing we use for the event image capsule network, for the Rutgers tagger and the jet image capsule network we employ the full pre-processing for each jet as described in Ref. [10]. The jets are selected and centered around the $p_T$ weighted centroid of the jet, and rotated such that the major principal axis is vertical. The image is then flipped to ensure that the maximum activity is in the upper-right-hand quadrant. Finally, the images are pixelated and normalized.'' An illustrative sketch of these pre-processing steps is given after this list.
4 - Compare CapsNets to networks of similar architecture but without the Capsules, for the ``Pooling CapsNets'' architecture of Figure 6. - We have now made this comparison, and have included the plot in a response to the report. We observe a small but persistent increase in performance by using capsule networks over a dense network with a similar architecture. This comes along with the advantage of having the capsule vectors themselves, which provide a window into how the network is making decisions.
5 - Consider adding a W′ signal to see how CapsNets deal with a signal which has some substructure signals and similar kinematics. While pre-selection should be able to deal with some of the differences here, it would be instructive for the study of the CapsNets themselves. - A W′ analysis would be an interesting new application for our network, but it would be a whole project in itself and falls outside the scope of our current publication.
6 - Some mention of how the results of [43] compare to the $t\bar{t}H$ classifier used here. - We have added ``For this set-up we find comparable performance to Ref. [43], with an AUC of 0.715, which is slightly above their upper limit.''
7 - [Optional] Publicly available code or code snippets. - Unfortunately, we are unable to dedicate the time to make a public code useful to the community.
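As a purely illustrative companion to item 1: the transition from the convolutional to the capsule part of the network amounts to re-shaping the convolutional feature maps into short capsule vectors. The sketch below is not the authors' code; the channel count, grid size, and capsule dimension are placeholder assumptions.

```python
import numpy as np

# Hypothetical convolutional output for one image: 32 feature maps of 8x8 pixels,
# i.e. shape (channels, height, width). These numbers are placeholders, not the
# values used in the paper.
conv_out = np.random.rand(32, 8, 8)

capsule_dim = 8  # assumed length of each primary capsule vector

# Re-shape: group the 32 channels into 32/8 = 4 capsule "types" of dimension 8,
# so every spatial location contributes 4 capsule vectors.
n_types = conv_out.shape[0] // capsule_dim
primary_caps = conv_out.reshape(n_types, capsule_dim, 8, 8)

# Flatten the spatial grid so each row is one capsule vector of length capsule_dim.
primary_caps = primary_caps.transpose(0, 2, 3, 1).reshape(-1, capsule_dim)
print(primary_caps.shape)  # (4 * 8 * 8, 8) = (256, 8)
```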
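For item 2, the routing referred to is routing-by-agreement in the sense of the original capsule paper. The following is a minimal numpy sketch of that loop with three iterations; the function names, shapes, and toy inputs are illustrative assumptions, not the implementation used in the paper.

```python
import numpy as np

def squash(v, axis=-1, eps=1e-8):
    """Non-linearity that shrinks short vectors towards 0 and long vectors towards unit length."""
    norm2 = np.sum(v**2, axis=axis, keepdims=True)
    return (norm2 / (1.0 + norm2)) * v / np.sqrt(norm2 + eps)

def route(u_hat, n_iterations=3):
    """Dynamic routing between two capsule layers.

    u_hat: prediction vectors, shape (n_in, n_out, dim_out)
    Returns the output capsule vectors, shape (n_out, dim_out).
    """
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))                               # routing logits
    for _ in range(n_iterations):                              # 3 iterations
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)   # coupling coefficients
        s = np.einsum('ij,ijk->jk', c, u_hat)                  # weighted sum over input capsules
        v = squash(s)                                          # squashed output capsules
        b = b + np.einsum('ijk,jk->ij', u_hat, v)              # agreement update
    return v

# toy example: 256 input capsules routed to 2 output capsules of dimension 16
u_hat = np.random.randn(256, 2, 16)
print(route(u_hat, n_iterations=3).shape)  # (2, 16)
```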
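The pre-processing chain quoted in item 3 (centering on the $p_T$-weighted centroid, rotating the major principal axis to vertical, flipping towards the most active side, pixelating, normalizing) can be sketched as below. This is an illustrative re-implementation under our own assumptions (grid size, image width, flip criterion), not the code used in Ref. [10] or in the paper.

```python
import numpy as np

def preprocess_jet(eta, phi, pt, n_pix=40, width=1.6):
    """Illustrative jet-image pre-processing on constituent (eta, phi, pt) arrays."""
    # 1) center on the pT-weighted centroid of the jet
    eta = eta - np.average(eta, weights=pt)
    phi = phi - np.average(phi, weights=pt)

    # 2) rotate so the major principal axis of the pT-weighted covariance is vertical
    #    (here: along the phi axis)
    cov = np.cov(np.vstack([eta, phi]), aweights=pt)
    evals, evecs = np.linalg.eigh(cov)
    major = evecs[:, np.argmax(evals)]
    angle = np.arctan2(major[0], major[1])
    c, s = np.cos(angle), np.sin(angle)
    eta, phi = c * eta - s * phi, s * eta + c * phi

    # 3) flip so more pT lies at eta > 0 and phi > 0, pushing the activity
    #    towards the upper-right quadrant (a simplified flip criterion)
    if pt[eta > 0].sum() < pt[eta < 0].sum():
        eta = -eta
    if pt[phi > 0].sum() < pt[phi < 0].sum():
        phi = -phi

    # 4) pixelate into an n_pix x n_pix image and 5) normalize
    image, _, _ = np.histogram2d(eta, phi, bins=n_pix,
                                 range=[[-width, width], [-width, width]], weights=pt)
    return image / image.sum()

# toy jet with 50 constituents
rng = np.random.default_rng(0)
img = preprocess_jet(rng.normal(0, 0.3, 50), rng.normal(0, 0.3, 50),
                     rng.exponential(10.0, 50))
print(img.shape, img.sum())  # (40, 40) and a total of 1
```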
Published as SciPost Phys. 8, 023 (2020)
