Chapter 3

Variational Autoencoders

An autoencoder compresses data to a latent vector and reconstructs it, but nothing forces the latent space to be well organized, so decoding a randomly sampled latent vector usually produces garbage. VAEs fix this by encoding each input as a distribution rather than a single point, and by adding a KL divergence term to the loss as a regularizer that pushes the latent space to be continuous and complete. This chapter builds from the vanilla autoencoder's failure mode to the full VAE architecture, the two-term loss function (reconstruction plus KL divergence), and the real-world applications where VAEs outshine GANs: anomaly detection, controllable generation, and drug discovery.
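
As a preview of the two-term loss, here is a minimal NumPy sketch rather than a definitive implementation. It assumes a diagonal Gaussian encoder that outputs a mean `mu` and log-variance `log_var`, a standard normal prior, and a squared-error reconstruction term. The function names and the `beta` weight are illustrative choices, not taken from the text.

```python
import numpy as np


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over latent dims."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)


def sample_latent(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I) (the reparameterization trick)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def vae_loss(x, x_hat, mu, log_var, beta=1.0):
    """Per-example loss: squared-error reconstruction plus beta-weighted KL term.

    Assumption: squared error stands in for the reconstruction term; other
    likelihoods (e.g. Bernoulli cross-entropy) are equally common.
    """
    reconstruction = np.sum((x - x_hat) ** 2, axis=-1)
    return reconstruction + beta * kl_to_standard_normal(mu, log_var)


# Toy usage: a batch of 4 inputs with 8 features and a 2-dimensional latent space.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
mu = np.zeros((4, 2))
log_var = np.zeros((4, 2))
z = sample_latent(mu, log_var, rng)        # shape (4, 2)
loss = vae_loss(x, x, mu, log_var)         # perfect reconstruction, encoder equals prior
```

In the toy usage, the encoder output matches the prior exactly and the reconstruction is perfect, so both terms vanish. Moving `mu` away from zero or shrinking the variance raises the KL term, which is the pressure that keeps the latent space organized.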