Seeing Is (Much) Harder Than It Looks

Here is a thought experiment: walk into any room and glance around. In a fraction of a second, you have identified where the furniture is, recognized a face, estimated which direction the light is coming from, and assessed whether there is anything dangerous nearby. Your brain did all of that without any conscious effort whatsoever.

Now try to teach a computer to do it.

Computer vision is the field of artificial intelligence, driven today by deep learning and before that by classical machine learning, that gives machines the ability to interpret and understand visual information from the world. When we say "computer vision," we mean any task whose input is visual data: still photos, video frames, medical scans, satellite imagery, microscope feeds, and everything in between.

That scope is enormous. And the applications have quietly embedded themselves into everyday life in ways most people never notice.

What This Unit Covers

This unit builds a complete practitioner's foundation in computer vision. We start with the intuition (why does this exist? what problem does it solve?), move into the mechanics (how does it actually work?), and land on application (where would you use this in the real world?).

Each technique covered here has already been deployed somewhere — in hospitals, on factory floors, in your phone's camera. Knowing where the field lives outside of textbooks will make you a better practitioner.
