A Peek Inside a CNN

What Does a CNN Actually See?
We've described CNNs in terms of operations — convolution, pooling, weight sharing. But what do those learned weights actually represent?
Unlike the weights of a fully connected network, CNN filters are small images we can literally look at. And when we do, something striking emerges: the hierarchy a CNN learns on its own closely mirrors the one neuroscientists found in the mammalian visual cortex. Early layers detect edges and colors. Middle layers detect textures and shapes. Deep layers detect objects and faces.
Nobody programmed this hierarchy. Gradient descent discovered it because it's the most efficient way to describe natural images.

Adam Harley
Visualize a Neural Network
An interactive 3D visualization of a CNN processing handwritten digits in real time.
Watch activations flow through each layer of a CNN in real time.
Explainability
A trained CNN can classify images with superhuman accuracy — but can we trust it? Does it understand what it's looking at, or is it exploiting patterns we haven't noticed? Explainability is the field of techniques for opening that black box. Two of these are feature visualization and saliency maps.
Feature Visualization
Feature visualization asks: what input maximally activates a given neuron? By generating images that make a filter fire as strongly as possible, we get a direct readout of what concept it has learned to detect. This is how researchers confirmed the edge → texture → object hierarchy — not by inspecting weights, but by visualizing what each filter responds to most.

Distill
Feature Visualization
Feature visualization allows us to see how GoogLeNet builds up its understanding of images over many layers.
A deep dive into feature visualization techniques applied to GoogLeNet.
Saliency Maps
Saliency maps flip the question: given a real image and a prediction, which pixels drove that decision? One approach computes the gradient of the predicted class score with respect to each input pixel — pixels that would most change the output if altered are highlighted. The result is a heatmap overlaid on the image showing where the network was "looking." Saliency maps are a great tool for debugging your models.
A CNN trained only to classify images spontaneously learns edge detectors, texture detectors, and object-part detectors — the same features neuroscientists found in the visual cortex. What does this suggest about how learning algorithms relate to the problems they're trained on?