Stanford University CS 223-B Introduction to Computer Vision

Project P9:
Improve Scale Invariant Feature Transform (SIFT)

Project Goal

The idea of using local oriented features over different visual scales has proven to be perhaps the most effective visual primitive for object recognition, robot localization, video retrieval and, recently, stitching images into panoramas. The first effective use of this idea was Christoph von der Malsburg's use of oriented Gabor filters over different scales linked in a graph. Recently, David Lowe has improved on the core idea by finding stable oriented features that indicate their scale (~depth) with his Scale Invariant Feature Transform (SIFT). SIFT is computationally efficient and has allowed real advances in 3D object recognition, robot localization, and panorama stitching. Figure 1 shows oriented SIFT features used for identifying 3D objects in clutter. The problem with SIFT is that the algorithm is not crisply defined, has many free parameters, and has no available source code to show how it is really implemented. The goal of this project is an improved version of SIFT features: first by simplifying and cleaning up the algorithm, then by improving its robustness to illumination and improving the SIFT keys as judged by recognition accuracy in outdoor scenes.

Figure 1: Object recognition with SIFT features*.

Project Scope

The project will use the existing Matlab SIFT code developed at Intel as the basis for cleaning up SIFT. Several cameras will be used to collect outdoor and indoor images of a set of objects. The stability of feature points will be improved, both under changing illumination and across different cameras. A SIFT recognition key, which is missing from the Matlab code, will be developed with the aim of maximizing recognition scores under varying illumination and cameras.
I want to emphasize that I feel these features are a fundamental advance in vision capabilities -- success in this project will have high practical impact on the field.

Tasks

The project will be accomplished through the following tasks. Completing Task 6 counts as basic project completion; the further the team gets beyond this, the better.

Task 1: Download the SIFT papers zip file (7M) and read all of them except SIFT06M*.

Task 2: Download, run, and understand the Matlab SIFT code (0.3M); it does not include recognition keys.
Task 3: Debug this code for small and large images.
Task 4: Create an image database: Capture images of 10 different objects under different poses, occlusions, and indoor and outdoor lighting at different times of day, using both cheap and higher-quality still and video cameras. Also collect scenes that don't contain the objects.

Task 5: Create & use a Matlab app for image study: Run the SIFT algorithm on these images to characterize the stability of feature points over views, illumination and cameras.

Task 6: Improve lighting robustness by normalizing potential SIFT Difference of Gaussian (DOG) points with a Sum of Gaussian (SOG).

Task 7: Implement Lowe's gradient histogram SIFT feature keys.
Task 8: Implement the Affine warped Hough transform recognition technique and collect recognition scores under varying illumination, pose and cameras.

Task 9: Simplify the SIFT key by using a 3x3 gradient histogram with overlapping bins. Test recognition scores and tweak for best scores.
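
As a rough illustration of Task 9, the sketch below builds one possible "3x3 gradient histogram with overlapping bins" key in plain Python. It is only one reading of the task, not Lowe's key or the project's final design: it assumes that "overlapping bins" means spatially overlapping cells (each cell covers half the patch width and neighboring cells overlap by 50%), and the function name gradient_key, the 8 orientation bins, and the 16x16 patch size are all hypothetical choices. It also omits the rotation to a dominant orientation and the Gaussian weighting that a full SIFT key would use.

```python
import math

def gradient_key(patch, grid=3, n_orient=8):
    """Hypothetical simplified key: grid x grid overlapping cells,
    each holding a magnitude-weighted orientation histogram."""
    size = len(patch)                      # patch is a square list of lists
    cell = size // 2                       # assumed cell width: half the patch
    step = (size - cell) // (grid - 1)     # cell spacing, so neighbors overlap
    key = []
    for gy in range(grid):
        for gx in range(grid):
            hist = [0.0] * n_orient
            for y in range(gy * step, gy * step + cell):
                for x in range(gx * step, gx * step + cell):
                    # Pixel differences, clamped at the patch border.
                    dx = patch[y][min(x + 1, size - 1)] - patch[y][max(x - 1, 0)]
                    dy = patch[min(y + 1, size - 1)][x] - patch[max(y - 1, 0)][x]
                    mag = math.hypot(dx, dy)
                    angle = math.atan2(dy, dx) % (2.0 * math.pi)
                    b = int(angle / (2.0 * math.pi) * n_orient) % n_orient
                    hist[b] += mag
            key.extend(hist)
    # Unit-length normalization removes the effect of a global contrast gain.
    norm = math.sqrt(sum(v * v for v in key))
    return [v / norm for v in key] if norm > 0 else key

# Example: a 16x16 horizontal intensity ramp yields a 3*3*8 = 72-element key.
ramp = [[float(x) for x in range(16)] for _ in range(16)]
key = gradient_key(ramp)
```

Because every pixel in the interior of the patch falls into more than one cell, small shifts of the patch move gradient energy gradually between cells rather than abruptly, which is the usual motivation for overlapping bins.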

If time permits, or afterwards, implement the algorithm in efficient C code. There is large extra credit for this, and the code would probably be included in OpenCV.
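
The normalization in Task 6 is not spelled out above, so the following is a minimal sketch of one plausible interpretation: divide the Difference of Gaussian response by the Sum of Gaussian response at the same two scales, so that a multiplicative change in illumination cancels out. The function names, the sigma values 1.0 and 1.6, the small constant eps, and the use of a 1D signal in place of an image are assumptions made to keep the example short.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur(signal, sigma):
    """Convolve with a Gaussian, replicating the border samples."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out

def normalized_dog(signal, sigma1=1.0, sigma2=1.6, eps=1e-9):
    """Assumed normalization: DOG / SOG at the same pair of scales."""
    a = blur(signal, sigma1)
    b = blur(signal, sigma2)
    return [(x - y) / (x + y + eps) for x, y in zip(a, b)]

# A bright bar on a dark background, and the same scene under 3x the light.
scene = [10.0] * 8 + [50.0] * 4 + [10.0] * 8
brighter = [3.0 * v for v in scene]
resp_dim = normalized_dog(scene)
resp_bright = normalized_dog(brighter)
```

Under this reading, the raw DOG response of the brighter scene is three times larger, while the normalized responses of the two scenes agree almost exactly, so a single detection threshold could be used across lighting conditions. Additive illumination offsets are not cancelled by this ratio.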

Pre-requisites

Solid Matlab coding skills and good intuition for recognition algorithms.

Project Contact

Project Status

Michael Turitzin, Anthony Hui, and Christer Gustavsson

Midterm Report

submitted

Final Report

submitted

* David G. Lowe, "Object Recognition from Local Scale-Invariant Features", ICCV'99
