GAN Mode Collapse
Challenge 2: GANs Are Prone to Mode Collapse
The generator discovers a small set of samples that reliably fool the discriminator and just generates those over and over, abandoning diversity. From a user experience perspective: you generate a few images, see they all look the same, and the product feels broken.
Mitigation 1: Minibatch Discrimination
In standard GAN training, each sample is generated and evaluated independently. With minibatch discrimination, the discriminator receives multiple samples simultaneously and computes a statistic across them (e.g., pairwise distances). The discriminator can now detect "all these look the same" and penalize it — forcing the generator to diversify.
Mitigation 2: Feature Matching
In standard GAN training, the generator is trained to fool the discriminator's final output, that is, to maximize the discriminator's classification loss on generated samples. Feature matching changes the objective: train the generator so that the statistics of the discriminator's intermediate-layer activations (typically their mean over a batch) match between real and generated samples. The generator isn't just trying to fool the final output; it's trying to produce samples whose internal representations match those of real data. This pushes it toward capturing the essential features of the data and tends to stabilize training.
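A minimal sketch of the feature matching objective, assuming the loss is the squared distance between batch-mean feature vectors. The fixed projection `W` and the function `intermediate_features` are hypothetical stand-ins for an intermediate discriminator layer; in real training the features would come from the discriminator itself and the loss would be backpropagated into the generator.

```python
import numpy as np

# Hypothetical fixed projection standing in for an intermediate
# discriminator layer (2-D inputs -> 4 features). Not a trained network.
W = np.array([[1.0, 0.0, 0.5, -0.5],
              [0.0, 1.0, 0.5, 0.5]])

def intermediate_features(x: np.ndarray) -> np.ndarray:
    """Map a (n, 2) batch to (n, 4) 'activations'."""
    return np.tanh(x @ W)

def feature_matching_loss(real_feats: np.ndarray, fake_feats: np.ndarray) -> float:
    """Squared L2 distance between the batch-mean feature vectors."""
    gap = real_feats.mean(axis=0) - fake_feats.mean(axis=0)
    return float(np.sum(gap ** 2))

rng = np.random.default_rng(0)
real = rng.normal(size=(256, 2))               # stand-in for real data
fake_close = rng.normal(size=(256, 2))         # generator output near the data
fake_far = rng.normal(loc=3.0, size=(256, 2))  # generator output far from the data

loss_close = feature_matching_loss(intermediate_features(real),
                                   intermediate_features(fake_close))
loss_far = feature_matching_loss(intermediate_features(real),
                                 intermediate_features(fake_far))
print(f"close: {loss_close:.4f}  far: {loss_far:.4f}")
```

The loss shrinks as the generated batch's average activations approach those of the real batch, so the generator gets a training signal even when the discriminator's final output is saturated.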
Each dot is a generated sample. The faint circles are the 6 true data modes. Watch what happens as training progresses: without minibatch discrimination, the generator discovers that samples from one or two modes reliably fool the discriminator and collapses onto those modes.
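One way to put a number on what the plot shows is to count how many of the 6 modes receive any generated samples. The sketch below assumes the modes sit evenly on a ring in 2-D and treats a sample as belonging to a mode if it lands within a fixed radius of it; the layout, the radius, and the names `ring_modes` and `mode_counts` are illustrative assumptions, not details from the demo.

```python
import numpy as np

def ring_modes(n_modes: int = 6, radius: float = 2.0) -> np.ndarray:
    """Assumed layout: mode centers evenly spaced on a circle."""
    angles = 2 * np.pi * np.arange(n_modes) / n_modes
    return np.stack([radius * np.cos(angles), radius * np.sin(angles)], axis=1)

def mode_counts(samples: np.ndarray, modes: np.ndarray, max_dist: float = 0.5) -> np.ndarray:
    """Count samples assigned to each mode (nearest center within max_dist)."""
    dists = np.linalg.norm(samples[:, None, :] - modes[None, :, :], axis=-1)
    nearest = dists.argmin(axis=1)
    close_enough = dists.min(axis=1) <= max_dist
    return np.bincount(nearest[close_enough], minlength=len(modes))

rng = np.random.default_rng(0)
modes = ring_modes()

# Simulated generator outputs: one covering all modes, one stuck on two.
healthy = modes[rng.integers(0, 6, size=300)] + 0.1 * rng.normal(size=(300, 2))
collapsed = modes[rng.integers(0, 2, size=300)] + 0.1 * rng.normal(size=(300, 2))

healthy_covered = int((mode_counts(healthy, modes) > 0).sum())
collapsed_covered = int((mode_counts(collapsed, modes) > 0).sum())
print(f"modes covered: healthy={healthy_covered}, collapsed={collapsed_covered}")
```

A covered-mode count that drops during training is a simple signal of collapse on toy data like this, where the true modes are known.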
How minibatch discrimination works
Standard GAN training evaluates each generated sample independently. The discriminator can't detect "you keep giving me the same image" because it never sees two generated samples at once. Minibatch discrimination feeds the discriminator a batch of samples simultaneously and adds a statistic (e.g. pairwise distances) as extra input. The discriminator can now penalize lack of diversity — forcing the generator to cover all modes.
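The idea can be sketched with the simplest possible statistic: the mean pairwise distance within a batch, appended to every sample as one extra input feature. This is a simplified illustration, not a full implementation; practical variants usually compute distances on learned features inside the discriminator. The function names are hypothetical.

```python
import numpy as np

def mean_pairwise_distance(batch: np.ndarray) -> float:
    """Mean Euclidean distance over all distinct pairs in a (n, d) batch, n >= 2."""
    diffs = batch[:, None, :] - batch[None, :, :]       # (n, n, d)
    dists = np.linalg.norm(diffs, axis=-1)              # (n, n)
    upper = np.triu_indices(batch.shape[0], k=1)        # each pair once
    return float(dists[upper].mean())

def append_batch_statistic(batch: np.ndarray) -> np.ndarray:
    """Give every sample the batch-level statistic as an extra feature column."""
    stat = mean_pairwise_distance(batch)
    column = np.full((batch.shape[0], 1), stat)
    return np.concatenate([batch, column], axis=1)      # (n, d + 1)

rng = np.random.default_rng(0)
diverse = rng.normal(size=(8, 2))
# Near-identical samples, as a collapsed generator would produce.
collapsed = np.tile([[1.0, -1.0]], (8, 1)) + 1e-3 * rng.normal(size=(8, 2))

print("diverse  :", mean_pairwise_distance(diverse))
print("collapsed:", mean_pairwise_distance(collapsed))
print("augmented shape:", append_batch_statistic(diverse).shape)
```

Because the extra column is near zero for a collapsed batch and clearly larger for a diverse one, a discriminator that sees it can learn to reject batches whose samples are all alike.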
Watch a GAN in training: the distribution of generated samples starts diverse, then collapses to a few modes. Toggle minibatch discrimination on and see how it resists the collapse.
You deploy a conditional GAN that generates marketing images from product descriptions. After launch, users report that every generation looks nearly identical regardless of the prompt. What GAN training pathology does this describe?