AutoPano: Image Stitching

AutoPano is an automated panoramic image stitching system developed as part of a geometric computer vision course at Worcester Polytechnic Institute. The system merges multiple overlapping photographs into seamless wide-angle composites through two implementations: a classical computer vision pipeline (Phase 1) and a deep learning-based approach (Phase 2).

Phase 1: Classical Pipeline

The classical pipeline processes image pairs through six sequential stages, from raw input to stitched panorama.

Step 1: Corner Detection

Salient keypoints are identified using the Harris and Shi-Tomasi corner detection algorithms, locating geometrically distinctive regions across each image.
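
As a rough illustration of the scores behind this step, the sketch below computes a Harris response and a Shi-Tomasi response with NumPy only. The function names, the box-filter window, and the constant k = 0.04 are assumptions made for illustration; the project itself may well rely on a library detector instead.

```python
import numpy as np

def box_sum(img, radius=1):
    """Sum each pixel's (2r+1)x(2r+1) neighbourhood, using edge padding."""
    padded = np.pad(img, radius, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out += padded[dy:dy + h, dx:dx + w]
    return out

def structure_tensor(gray, radius=1):
    """Windowed sums of gradient products (the 2x2 structure tensor per pixel)."""
    iy, ix = np.gradient(gray.astype(float))
    return box_sum(ix * ix, radius), box_sum(iy * iy, radius), box_sum(ix * iy, radius)

def harris_response(gray, k=0.04, radius=1):
    """Harris score: det(M) - k * trace(M)^2 (k = 0.04 is an assumed default)."""
    sxx, syy, sxy = structure_tensor(gray, radius)
    return (sxx * syy - sxy ** 2) - k * (sxx + syy) ** 2

def shi_tomasi_response(gray, radius=1):
    """Shi-Tomasi score: the smaller eigenvalue of the structure tensor."""
    sxx, syy, sxy = structure_tensor(gray, radius)
    return (sxx + syy) / 2 - np.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
```

Both scores are large where the image gradient varies in two directions. Along a straight edge the Harris score turns negative and the Shi-Tomasi score drops to about zero, which is why thresholding either map isolates corners.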

Step 2: Adaptive Non-Maximal Suppression (ANMS)

Raw corner responses are filtered with ANMS, which retains only the strongest corners while enforcing an even spatial distribution, so that features do not cluster around high-gradient regions.
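
One common way to implement ANMS is to give every corner a suppression radius, defined as the distance to the nearest corner with a strictly higher score, and then keep the corners with the largest radii. The sketch below follows that idea; the function name and the brute-force pairwise distance matrix are assumptions, and some formulations add a robustness factor that is omitted here.

```python
import numpy as np

def anms(points, scores, n_best):
    """Return indices of the n_best corners with the largest suppression radius.

    points: (N, 2) corner coordinates, scores: (N,) corner strengths.
    """
    points = np.asarray(points, dtype=float)
    scores = np.asarray(scores, dtype=float)

    # Squared distance between every pair of corners (O(N^2) memory).
    diff = points[:, None, :] - points[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)

    # stronger[i, j] is True when corner j has a higher score than corner i.
    stronger = scores[None, :] > scores[:, None]

    # Radius of corner i: distance to its nearest stronger corner.
    # The strongest corner has no stronger neighbour, so its radius is infinite.
    radii2 = np.where(stronger, dist2, np.inf).min(axis=1)

    # Largest radii first.
    return np.argsort(-radii2)[:n_best]
```

With three strong corners packed together and one weak corner far away, asking for two corners returns the strongest corner of the cluster and the isolated one, which is the spreading behaviour described above.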

Step 3: Feature Descriptor Extraction

An 8×8 patch-based descriptor is extracted around each retained corner. Patches are Gaussian-smoothed and normalized, producing 64-element feature vectors that can be compared directly.

Step 4: Feature Matching

Each descriptor is matched to its nearest neighbor in the other image. A ratio test then rejects ambiguous matches by comparing the distance to the best candidate against the distance to the second-best candidate.
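
The ratio test can be sketched as below. The function name, the use of squared Euclidean distance, and the 0.75 threshold are assumptions; the appropriate threshold depends on the descriptors and the distance used.

```python
import numpy as np

def match_ratio(desc_a, desc_b, ratio=0.75):
    """Match rows of desc_a to rows of desc_b, keeping only unambiguous matches.

    desc_b needs at least two rows so that a second-best candidate exists.
    Returns a list of (index_in_a, index_in_b) pairs.
    """
    desc_a = np.asarray(desc_a, dtype=float)
    desc_b = np.asarray(desc_b, dtype=float)

    # Squared distance from every descriptor in A to every descriptor in B.
    d2 = np.sum((desc_a[:, None, :] - desc_b[None, :, :]) ** 2, axis=-1)

    order = np.argsort(d2, axis=1)
    rows = np.arange(len(desc_a))
    best, second = order[:, 0], order[:, 1]

    # Keep a match only if the best distance is clearly below the second best.
    keep = d2[rows, best] < ratio * d2[rows, second]
    return [(int(i), int(best[i])) for i in rows[keep]]
```

A descriptor that sits equally close to two candidates fails the test and is dropped, which is the filtering of ambiguous matches described above.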

Step 5: RANSAC Homography Estimation

A robust homography matrix is computed using RANSAC (Random Sample Consensus), which iteratively identifies the largest set of inlier matches while rejecting outliers caused by noise or mismatches.
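
The loop can be sketched as follows: fit a homography to four random matches with the direct linear transform, count the matches it reprojects within a pixel threshold, and refit on the largest inlier set found. The function names, the 500 iterations, the 3-pixel threshold, and the fixed seed are assumptions for illustration, and the sketch skips the coordinate normalization that a production implementation would usually add.

```python
import numpy as np

def fit_homography(src, dst):
    """Direct linear transform: least-squares H with dst ~ H @ src (>= 4 points)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)          # null-space vector of the system
    if abs(h[2, 2]) < 1e-12:
        return None                   # degenerate fit, cannot normalize
    return h / h[2, 2]

def project(h, pts):
    """Apply homography h to an (N, 2) array of points."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = pts_h @ h.T
    return mapped[:, :2] / mapped[:, 2:3]

def ransac_homography(src, dst, n_iters=500, thresh=3.0, seed=0):
    """Return (H, inlier_mask); H is None if fewer than 4 inliers are found."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(len(src), size=4, replace=False)
        h = fit_homography(src[idx], dst[idx])
        if h is None:
            continue
        with np.errstate(divide="ignore", invalid="ignore"):
            err = np.linalg.norm(project(h, src) - dst, axis=1)
        inliers = err < thresh        # NaN errors compare False and are rejected
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() < 4:
        return None, best
    # Final estimate uses every inlier, not just the 4-point sample.
    return fit_homography(src[best], dst[best]), best
```

Refitting on the full inlier set at the end usually gives a more stable estimate than the best four-point sample alone, since it averages over many more correspondences.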

Step 6: Image Warping & Final Panorama

The estimated homography warps images into a common coordinate frame. Blending resolves seams at overlap boundaries to produce the final seamless panorama.
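
A simplified sketch of this step for two grayscale images is shown below. It sizes the canvas from the warped corners, fills it by inverse mapping with nearest-neighbor sampling, and averages the two images where they overlap. The function names and the plain averaging are assumptions; the project's actual blending method is not specified here, and the sketch assumes the homography keeps the whole image in front of the camera.

```python
import numpy as np

def project(h, pts):
    """Apply homography h to an (N, 2) array of (x, y) points."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = pts_h @ h.T
    with np.errstate(divide="ignore", invalid="ignore"):
        return mapped[:, :2] / mapped[:, 2:3]

def stitch_pair(img_a, img_b, h_ab):
    """Warp img_a into img_b's frame with h_ab (a -> b) and average the overlap."""
    ha, wa = img_a.shape
    hb, wb = img_b.shape

    # Canvas bounds: warped corners of A plus the extent of B.
    corners = np.array([[0, 0], [wa - 1, 0], [wa - 1, ha - 1], [0, ha - 1]], float)
    extent = np.vstack([project(h_ab, corners), [[0, 0], [wb - 1, hb - 1]]])
    x_min, y_min = (int(v) for v in np.floor(extent.min(axis=0)))
    x_max, y_max = (int(v) for v in np.ceil(extent.max(axis=0)))
    height, width = y_max - y_min + 1, x_max - x_min + 1

    # Inverse mapping: for each canvas pixel, find where it comes from in A.
    xs, ys = np.meshgrid(np.arange(x_min, x_max + 1), np.arange(y_min, y_max + 1))
    grid = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
    src = project(np.linalg.inv(h_ab), grid)
    sx, sy = src[:, 0], src[:, 1]
    valid = (sx > -0.5) & (sx < wa - 0.5) & (sy > -0.5) & (sy < ha - 0.5)

    total = np.zeros(height * width)
    count = np.zeros(height * width)
    total[valid] = img_a[np.rint(sy[valid]).astype(int), np.rint(sx[valid]).astype(int)]
    count[valid] = 1
    total = total.reshape(height, width)
    count = count.reshape(height, width)

    # Paste B at its offset in the canvas and accumulate.
    total[-y_min:-y_min + hb, -x_min:-x_min + wb] += img_b
    count[-y_min:-y_min + hb, -x_min:-x_min + wb] += 1

    # Average where both images contribute; empty pixels stay zero.
    return total / np.maximum(count, 1)
```

Inverse mapping is used so that every canvas pixel receives a value and no holes appear, which forward mapping of source pixels cannot guarantee. Plain averaging can still leave visible seams when exposures differ, so feathered or multi-band blending is a common refinement.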

Phase 2: Deep Learning Approach

Phase 2 replaces the hand-crafted feature pipeline with a neural network trained to directly predict the homography between image pairs. Rather than detecting and matching keypoints explicitly, the network learns a mapping from image patches to the 4-point parametrization of the geometric transformation, enabling end-to-end differentiable training.
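
To make the 4-point parametrization concrete, the sketch below converts between a 3×3 homography and the displacements of four patch corners, which are eight numbers in total. The function names and the 128×128 patch size in the usage note are assumptions; the network architecture and training procedure are not shown.

```python
import numpy as np

def four_point_to_homography(corners, offsets):
    """Recover the 3x3 homography that moves each corner by its (dx, dy) offset."""
    corners = np.asarray(corners, dtype=float)
    targets = corners + np.asarray(offsets, dtype=float)
    a, b = [], []
    for (x, y), (u, v) in zip(corners, targets):
        # Two linear equations per corner, with H[2, 2] fixed to 1.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = np.linalg.solve(np.array(a), np.array(b))
    return np.append(h, 1.0).reshape(3, 3)

def homography_to_four_point(h, corners):
    """Express a homography as the displacement it applies to four corners."""
    corners = np.asarray(corners, dtype=float)
    pts_h = np.hstack([corners, np.ones((4, 1))])
    mapped = pts_h @ h.T
    return mapped[:, :2] / mapped[:, 2:3] - corners
```

For example, with the corners of a 128×128 patch, `[[0, 0], [127, 0], [127, 127], [0, 127]]`, a network output of eight offsets can be passed to `four_point_to_homography` to obtain the matrix used for warping. Corner displacements are all measured in pixels, whereas the entries of a 3×3 homography mix rotation, translation, and perspective terms of very different magnitudes, which is one reason this parametrization is commonly preferred as a regression target.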
