SciPost Submission Page
Uncertainties associated with GAN-generated datasets in high energy physics
by Konstantin T. Matchev, Alexander Roman, Prasanth Shyamsundar
Submission summary
| Authors (as registered SciPost users): | Konstantin T. Matchev · Prasanth Shyamsundar |
| Submission information | |
|---|---|
| Preprint Link: | https://arxiv.org/abs/2002.06307v4 (pdf) |
| Date accepted: | Feb. 28, 2022 |
| Date submitted: | Feb. 17, 2022, 8:19 p.m. |
| Submitted by: | Prasanth Shyamsundar |
| Submitted to: | SciPost Physics |
| Ontological classification | |
|---|---|
| Academic field: | Physics |
| Specialties: | |
| Approach: | Phenomenological |
Abstract
Recently, Generative Adversarial Networks (GANs) trained on samples of traditionally simulated collider events have been proposed as a way of generating larger simulated datasets at a reduced computational cost. In this paper we point out that data generated by a GAN cannot statistically be better than the data it was trained on, and critically examine the applicability of GANs in various situations, including a) replacing the entire Monte Carlo pipeline or parts of it, and b) producing datasets for use in highly sensitive analyses or sub-optimal ones. We present our arguments through information-theoretic demonstrations, a toy example, and a formal statement, and identify some potentially valid uses of GANs in collider simulations.
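The central claim of the abstract can be illustrated with a small numerical sketch. This is not the toy example from the paper: it assumes a Gaussian fitted to the training sample as a simple stand-in for a trained GAN, and every name and parameter value below (`run_trial`, `compare_spreads`, `n_train`, `n_gen`) is hypothetical. The sketch estimates a mean from a large generated sample and compares the trial-to-trial spread of that estimate with the spread obtained from the smaller training sample alone.

```python
import math
import random
import statistics


def run_trial(rng, n_train, n_gen, true_mu=0.0, true_sigma=1.0):
    """Draw a training sample, fit a Gaussian to it (an assumed stand-in
    for a GAN), and generate a larger sample from the fitted model."""
    train = [rng.gauss(true_mu, true_sigma) for _ in range(n_train)]
    mu_hat = statistics.fmean(train)
    sigma_hat = statistics.stdev(train)
    generated = [rng.gauss(mu_hat, sigma_hat) for _ in range(n_gen)]
    # Return the mean estimated from the training data and from the
    # (much larger) generated data.
    return mu_hat, statistics.fmean(generated)


def compare_spreads(n_trials=300, n_train=100, n_gen=2000, seed=0):
    """Repeat the experiment and measure how much each estimate of the
    mean fluctuates from trial to trial."""
    rng = random.Random(seed)
    results = [run_trial(rng, n_train, n_gen) for _ in range(n_trials)]
    spread_train = statistics.pstdev([r[0] for r in results])
    spread_generated = statistics.pstdev([r[1] for r in results])
    # Uncertainty one would naively quote if the generated events were
    # treated as n_gen independent draws from the true distribution
    # (true sigma is 1.0 in this sketch).
    naive = 1.0 / math.sqrt(n_gen)
    return spread_train, spread_generated, naive


if __name__ == "__main__":
    spread_train, spread_generated, naive = compare_spreads()
    print(f"spread of mean from training sample:  {spread_train:.4f}")
    print(f"spread of mean from generated sample: {spread_generated:.4f}")
    print(f"naive 1/sqrt(n_gen) expectation:      {naive:.4f}")
```

Under these assumptions, the spread of the estimate from the generated sample stays close to that of the training sample rather than shrinking toward the naive expectation: the generated events inherit the statistical uncertainty of the model fit, which is set by the size of the training sample.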
Published as SciPost Phys. 12, 104 (2022)
