Traditional Models for Computer Vision

Pairing Features with Models

Once features are extracted, they are fed into a standard machine learning model. Different models have different strengths depending on the feature types involved:

Support Vector Machines: Work particularly well with HOG and SIFT features. Effective in high-dimensional feature spaces and often competitive when training data is limited.
Random Forests: Handle varied feature types gracefully. Because tree splits compare one feature against a threshold, you can combine color histograms with texture features without rescaling them first. Built-in feature importance scores indicate which image properties are most useful.
k-Nearest Neighbors: Effective with normalized feature vectors in spaces where distance is meaningful, such as color distributions. Commonly used for image retrieval (find the most similar image).
Gradient Boosting (e.g., XGBoost): Robust across combined, high-dimensional feature sets. A strong choice when features come in different scales and types.
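
The pairings above can be sketched in a few lines. This is a minimal illustration, assuming scikit-learn is installed; the "color" and "texture" blocks are synthetic stand-ins rather than descriptors computed from real images, and all variable names are hypothetical. It shows the distance- and margin-based models wrapped in a scaler, while the random forest takes the mixed-scale features as they are and reports feature importances.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_samples = 400
labels = rng.integers(0, 2, size=n_samples)

# Synthetic stand-ins for extracted features (assumption, not real descriptors):
# 8 "color" values on a 0..1 scale and 4 "texture" values on a 0..1000 scale.
color = rng.random((n_samples, 8))
texture = rng.random((n_samples, 4)) * 1000.0

# Make the first feature of each block informative about the label.
color[:, 0] += 0.5 * labels
texture[:, 0] += 300.0 * labels

# Concatenate the blocks into one feature vector per sample.
X = np.hstack([color, texture])
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=0
)

models = {
    # SVM and kNN depend on distances or margins, so features are standardized first.
    "svm": make_pipeline(StandardScaler(), LinearSVC()),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    # Tree-based splits are threshold comparisons, so no scaling step is used here.
    "forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # accuracy on held-out samples

# One importance value per input column; the values sum to 1.
importances = models["forest"].feature_importances_
print(scores)
print(importances.round(3))
```

In a real pipeline, `X` would come from the feature extraction step (for example HOG or color histogram vectors), and the accuracy numbers here say nothing about how these models compare on actual images.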

Why Traditional Approaches Have a Ceiling

Traditional CV has fundamental limitations that become more apparent as the problems get harder.

Feature engineering is bottlenecked by human expertise. Capturing all relevant properties of an image in hand-designed features is genuinely difficult, and the features that work for one domain often fail in another.
Many traditional features discard spatial relationships between pixels. A color histogram tells you how much red is in an image, but not where it appears. A gray-level co-occurrence matrix (GLCM) captures local texture but cannot capture hierarchical structure across scales.
Limited generalizability. A feature set tuned for detecting pedestrians will likely perform poorly on detecting medical anomalies. Each new domain often requires re-engineering the features from scratch.
Hard to capture abstract patterns. Some visual properties that matter most for a task are not easily articulable as explicit features.
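
The point about color histograms losing spatial information can be checked directly. The following is a small sketch using only NumPy; the 8x8 image and the helper `intensity_histogram` are hypothetical and exist only for this illustration. Two images that contain the same pixel values in different positions produce identical histograms.

```python
import numpy as np

# Hypothetical 8x8 grayscale image: left half dark, right half bright.
image = np.zeros((8, 8), dtype=np.uint8)
image[:, 4:] = 200

# Same pixel values, different layout: top half dark, bottom half bright.
# Any other rearrangement of the pixels would behave the same way.
rearranged = image.T.copy()

def intensity_histogram(img, bins=16):
    """Count pixels per intensity bin, ignoring where each pixel sits."""
    counts, _ = np.histogram(img, bins=bins, range=(0, 256))
    return counts

hist_original = intensity_histogram(image)
hist_rearranged = intensity_histogram(rearranged)

same_histogram = np.array_equal(hist_original, hist_rearranged)
same_image = np.array_equal(image, rearranged)

print(same_histogram)  # True: the histogram cannot tell the two images apart
print(same_image)      # False: the images themselves are different
```

A classifier that sees only the histogram has no way to distinguish a vertical edge from a horizontal one in this example, which is the kind of information a learned spatial representation can retain.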

Checkpoint

A team uses HOG features and an SVM to build a pedestrian detector for outdoor security cameras. They want to adapt the same pipeline for detecting surgical instruments in operating room footage. What is the most likely outcome?


Enter: Deep Learning

What if, instead of engineering features by hand, we let the model learn the features directly from data? That is the idea behind convolutional neural networks — and it is what we turn to next.
