A Peek Inside a CNN

Features from VGG19 Layer 14. Notice the curves and arcs: building blocks that appear in nearly every object. Remarkably, researchers have found curve detectors like these in a range of CNN-based image classifiers trained on natural images.

What Does a CNN Actually See?

We've described CNNs in terms of operations — convolution, pooling, weight sharing. But what do those learned weights actually represent?

Unlike the weights of a fully connected network, the filters in a CNN's first layer are small image patches we can literally look at. Filters in deeper layers act on feature maps rather than pixels, so they take more work to interpret. When we do look, something striking emerges: the hierarchy a CNN learns on its own resembles the one neuroscientists found in the mammalian visual cortex. Early layers detect edges and colors. Middle layers detect textures and shapes. Deep layers detect object parts and whole objects, such as faces.

Nobody programmed this hierarchy. Gradient descent discovered it, plausibly because building complex features out of simpler ones is an efficient way to describe natural images.

Explainability

A trained CNN can match or beat human accuracy on some image-classification benchmarks, but can we trust it? Does it understand what it's looking at, or is it exploiting patterns we haven't noticed? Explainability is the set of techniques for opening that black box. Two of these are feature visualization and saliency maps.

Feature Visualization

Feature visualization asks: what input maximally activates a given neuron or filter? By generating images that make a filter fire as strongly as possible, typically by starting from noise and adjusting the pixels with gradient ascent, we get a direct readout of the pattern it has learned to detect. This is how researchers confirmed the edge → texture → object hierarchy: not by inspecting weights, but by visualizing what each filter responds to most.
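To make the idea concrete, here is a minimal sketch of activation maximization in plain NumPy. It is a toy under stated assumptions, not how a real network is visualized: the "filter" is a hand-written 3x3 vertical-edge kernel standing in for a learned one, the gradient is derived by hand instead of by autodiff, and the names (conv2d_valid, filter_score, score_gradient, visualize_filter) are made up for this example. Starting from noise, it repeatedly nudges the pixels in the direction that raises the filter's mean ReLU activation, keeping the image at unit norm so the score cannot grow just by scaling the pixels up.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Cross-correlate a 2-D image with a 2-D kernel (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def filter_score(image, kernel):
    """Mean ReLU activation of one filter over the whole image."""
    return np.maximum(conv2d_valid(image, kernel), 0.0).mean()

def score_gradient(image, kernel):
    """Hand-derived gradient of filter_score with respect to each pixel."""
    kh, kw = kernel.shape
    pre = conv2d_valid(image, kernel)
    grad = np.zeros_like(image)
    for i in range(pre.shape[0]):
        for j in range(pre.shape[1]):
            if pre[i, j] > 0:  # ReLU passes gradient only where it is active
                grad[i:i + kh, j:j + kw] += kernel
    return grad / pre.size

def visualize_filter(kernel, size=8, steps=200, lr=0.5, seed=0):
    """Gradient ascent on the input pixels, starting from unit-norm noise."""
    rng = np.random.default_rng(seed)
    image = rng.normal(size=(size, size))
    image /= np.linalg.norm(image)
    for _ in range(steps):
        image = image + lr * score_gradient(image, kernel)
        image /= np.linalg.norm(image)  # project back to unit norm
    return image

# Assumed stand-in for a learned filter: a Sobel-style vertical-edge kernel.
vertical_edge = np.array([[-1.0, 0.0, 1.0],
                          [-2.0, 0.0, 2.0],
                          [-1.0, 0.0, 1.0]])

# Same seed as visualize_filter, so this is the image the ascent started from.
start = np.random.default_rng(0).normal(size=(8, 8))
start /= np.linalg.norm(start)

result = visualize_filter(vertical_edge)
print("score before:", filter_score(start, vertical_edge))
print("score after: ", filter_score(result, vertical_edge))
```

Plotting result (for example with matplotlib's imshow) shows the pattern this filter prefers. For a deep network the loop is the same in spirit, but the gradient comes from automatic differentiation and practical methods usually add regularization so the optimized image stays recognizable.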

Saliency Maps

Saliency maps flip the question: given a real image and a prediction, which pixels drove that decision? One approach computes the gradient of the predicted class score with respect to each input pixel, then highlights the pixels that would most change the output if altered. The result is a heatmap overlaid on the image showing where the network was "looking." Saliency maps are a useful debugging tool: if the highlighted pixels fall on the background rather than on the object, the model may be relying on a spurious cue.
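The gradient-based approach can be sketched in a few lines of NumPy. This is an illustration under loud assumptions: the "classifier" is a tiny two-layer network with random weights standing in for a trained CNN, the backward pass is written by hand instead of using an autodiff framework, and the names (class_scores, saliency_map, W1, W2) are invented for this example. The saliency of a pixel is taken to be the absolute value of the gradient of the predicted class score with respect to that pixel.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, HIDDEN, CLASSES = 6, 6, 16, 3

# Hypothetical toy classifier with random weights (a stand-in for a trained CNN).
W1 = rng.normal(size=(HIDDEN, H * W))
b1 = rng.normal(size=HIDDEN)
W2 = rng.normal(size=(CLASSES, HIDDEN))
b2 = rng.normal(size=CLASSES)

def class_scores(image):
    """Forward pass: flatten, linear layer, ReLU, linear layer."""
    hidden = np.maximum(W1 @ image.ravel() + b1, 0.0)
    return W2 @ hidden + b2

def saliency_map(image):
    """Return the predicted class and |d score / d pixel| for that class."""
    pre = W1 @ image.ravel() + b1
    predicted = int(np.argmax(W2 @ np.maximum(pre, 0.0) + b2))
    # Backpropagate by hand: the score's gradient w.r.t. the hidden units,
    # gated by the ReLU, then pushed back through the first layer.
    grad_hidden = W2[predicted] * (pre > 0)
    grad_input = W1.T @ grad_hidden
    return predicted, np.abs(grad_input).reshape(image.shape)

image = rng.normal(size=(H, W))
predicted, saliency = saliency_map(image)
print("predicted class:", predicted)
print("most salient pixel:", np.unravel_index(np.argmax(saliency), saliency.shape))
```

With a real model, the same quantity is usually obtained by asking the framework for the gradient of the class score with respect to the input tensor, and the resulting map is rescaled and overlaid on the image as a heatmap.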

💭Reflection

A CNN trained only to classify images spontaneously learns edge detectors, texture detectors, and object-part detectors, features much like those neuroscientists found in the visual cortex. What does this suggest about how learning algorithms relate to the problems they're trained on?