Stanford University CS 223-B Introduction to Computer Vision

Midterm Report

not yet submitted

Project P5:
Minimally supervised learning in computer vision

Project Goal

Cameras keep improving in quality while dropping in price, and computing power per dollar is growing exponentially, yet the visual world is too large and varied for carefully hand-crafted models. We need methods that fully or partially automate the visual tasks we want to perform. The goal of this project is to automatically find features and processing steps that cluster objects (bonus: behaviors) into separate classes, which may then be labeled by hand.

Figure 1: The visual world is complex; we have to automate its perception.

This project will use a single standard or high-quality camera to collect video data of a chosen scene and automatically cluster the objects (and possibly behaviors) that occur in that scene.

Project Scope

This project will collect large amounts of video data from a chosen scene, say the robotics lab itself, preprocess the data with a wide range of feature extractors, and then use feature selection and clustering to break the data up into (hopefully) meaningful clusters. The feature processing can take advantage of the OpenCV library.

A decision must then be made between unsupervised clustering (agglomerative or spectral clustering, for which the instructor may provide code) and, perhaps better, using a supervised learning technique in an unsupervised manner, as discussed by Leo Breiman in his "Looking Inside the Black Box" lecture. Briefly, Breiman's method takes the original feature data as class 1, then scrambles the features and takes the scrambled data as class 2. CART, boosted decision trees, or Breiman's Random Forests may then be used to tell the two classes apart. If the classes can be separated, feature selection can be applied iteratively to eliminate nuisance features. The structure of the learned model may then be used to impose a distance metric between points, and this metric may be used both to cluster the data (via spectral clustering) and to visualize it for labeling.

The basic goal is to classify "things that change" (people, robots, moved chairs, etc.). The bonus goal is to use temporal information to classify behavior as well.
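The construction of the second class can be sketched in C as follows. This is a minimal sketch under stated assumptions, not Breiman's reference implementation: it reads "scrambling" as independently permuting each feature column, which keeps every feature's marginal distribution while breaking the dependencies between features. The function names and the row-major `float` layout are hypothetical choices made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Integer in [0, n) drawn with rand(). The slight modulo bias is
 * acceptable for a sketch; a real experiment may want a better RNG. */
static size_t rand_below(size_t n)
{
    assert(n > 0);
    return (size_t)rand() % n;
}

/*
 * Build the synthetic "class 2" data (assumed reading of "scrambling").
 *
 * real      : n_rows x n_cols feature matrix, row-major, left untouched
 * scrambled : output buffer of the same size
 *
 * Every column is shuffled independently with a Fisher-Yates shuffle,
 * so each feature keeps its own distribution but no longer lines up
 * with the other features of the same sample.
 */
void make_scrambled_class(const float *real, float *scrambled,
                          size_t n_rows, size_t n_cols)
{
    assert(real != NULL && scrambled != NULL);
    memcpy(scrambled, real, n_rows * n_cols * sizeof(float));

    for (size_t c = 0; c < n_cols; ++c) {
        for (size_t i = n_rows; i > 1; --i) {
            size_t j = rand_below(i); /* j in [0, i) */
            float tmp = scrambled[(i - 1) * n_cols + c];
            scrambled[(i - 1) * n_cols + c] = scrambled[j * n_cols + c];
            scrambled[j * n_cols + c] = tmp;
        }
    }
}
```

A caller would seed the generator once with `srand`, label the rows of `real` as class 1 and the rows of `scrambled` as class 2, and hand both to the chosen classifier. If the classifier cannot beat chance, the features carry little joint structure worth clustering.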

Tasks

The project will be accomplished through the following tasks. For speed, and to reuse existing code, the project will be written mostly in C.

Task 1: Data collection.

Task 2: Feature preprocessing. Most features exist in OpenCV; some additional ones will have to be coded.

Task 3: Small-scale experiments to decide between the supervised and unsupervised approaches. Initial code exists but will have to be modified.

Task 4: Automatic clustering.

Task 5: Visualization, to analyze and label resulting clusters.

Task 6: Use of clusters to identify the objects (bonus: behaviors) in real time.

You will have assistance setting up data collection, storage, and processing power in the robotics lab.
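For Tasks 4 and 5, the distance derived from a learned tree model, as described under Project Scope, could be sketched as below. This is an illustration under assumptions rather than the project's prescribed method: it uses the proximity commonly associated with tree ensembles (the fraction of trees in which two points fall into the same leaf) and takes one minus that value as a dissimilarity. The `leaf` array layout and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * leaf[t * n_points + i] holds the index of the leaf that tree t assigns
 * to point i (hypothetical layout; any trained tree ensemble could fill it).
 *
 * Returns the fraction of trees in which points i and j share a leaf,
 * a value in [0, 1] that equals 1 when i == j.
 */
double forest_proximity(const int *leaf, size_t n_trees, size_t n_points,
                        size_t i, size_t j)
{
    assert(leaf != NULL && n_trees > 0);
    assert(i < n_points && j < n_points);

    size_t shared = 0;
    for (size_t t = 0; t < n_trees; ++t) {
        if (leaf[t * n_points + i] == leaf[t * n_points + j])
            ++shared;
    }
    return (double)shared / (double)n_trees;
}

/* Dissimilarity in [0, 1]: 0 for points that always share a leaf. */
double forest_dissimilarity(const int *leaf, size_t n_trees, size_t n_points,
                            size_t i, size_t j)
{
    return 1.0 - forest_proximity(leaf, n_trees, n_points, i, j);
}
```

Evaluating `forest_dissimilarity` for all pairs of real data points gives a symmetric matrix that a spectral clustering routine or a 2-D embedding for visualization could take as input.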

Project Status

Unclaimed.

Point of Contact

Gary Bradski
