Independent Final Projects

IP01. Single View Corridor Reconstruction

Erick Delage, Honglak Lee
Staff contact: Hendrik
It is commonly believed that human beings do not rely solely on stereo vision to estimate distances between objects in their environment. In fact, we can estimate the dimensions of things quite accurately from a single photograph of a room or a landscape. In an attempt to understand these natural skills and port them to a machine, the goal of this project is to demonstrate how a machine can be trained to precisely estimate the position of floor boundaries in an image of a hallway and, from this information, recover an accurate model of the walls and the obstacles present in the scene.
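As a hedged illustration of why the floor boundary is enough to recover geometry: under a pinhole camera at known height viewing a level floor, the image row of a floor point determines its distance. The camera parameters below are invented for illustration, not taken from the project.

```python
# Toy sketch: depth of a floor point from its image row, assuming a
# pinhole camera at height cam_height_m looking down a level corridor.
# The horizon row, focal length, and height are illustrative assumptions.

def floor_depth(v, v_horizon=240.0, focal_px=500.0, cam_height_m=1.5):
    """Distance (m) to the floor point imaged at row v (rows grow downward)."""
    if v <= v_horizon:
        raise ValueError("row must lie below the horizon")
    return cam_height_m * focal_px / (v - v_horizon)

# A floor-boundary pixel 50 rows below the horizon lies 1.5 * 500 / 50 = 15 m away.
print(floor_depth(290))
```

Points lower in the image (larger v) map to nearer floor locations, which is exactly why an accurate floor-boundary estimate yields an accurate wall model.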

IP03. Age Classification

Charlie Stockman, Justin Durack
Staff contact: Dan
Face detection has received a lot of attention because faces provide us with useful information (mood, gender, ethnicity, etc.). Extracting significant features and interpreting them is difficult. Some research has been done on age classification specifically, but existing classifiers are still fairly elementary, successfully placing people only into very broad age groups. As age and other demographic-trait classification improves, applications could someday use it to improve the user experience. In the meantime, a reasonable age guesser could make for a popular website.

The scope of this project is to implement the age classifier presented by Horng et al. and, time permitting, extend aspects of it, such as improving the classification itself or making the face detection more robust.

IP04. Analysis of Parking Patterns

Kemal El Moujahid
Staff contact: Dan
Large car-parking companies operate structures with capacities of thousands of cars. At least three types of problems can be addressed using computer vision:

1. Indicate to users as they enter which spots are empty (or allocate spots): parking companies need to differentiate themselves, and being able to provide extra service to their customers is valuable.

2. Prevent users from parking their cars in inappropriate spots: using two spots instead of one, or using a forbidden spot.

3. Prevent cars from being stolen or broken into: human monitoring has its limits and is costly, so a computer system that assists human monitoring by flagging suspicious activity would be valuable.

The goal of this project is to solve problems 1-2 and attempt to solve 3.
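For problem 1, one simple starting point (an assumption about the approach, not a statement of the project's method) is background differencing with a fixed overhead camera: compare each spot's current brightness against its empty-lot appearance. The grid values and threshold below are invented for illustration.

```python
# Toy sketch of empty-spot detection by background differencing.
# Each cell is the mean gray level of one parking spot; all numbers
# here are illustrative assumptions.

EMPTY_LOT = [
    [10, 10, 10],
    [10, 10, 10],
]  # background model: each spot's brightness when the lot is empty

def empty_spots(frame, background, threshold=30):
    """Return (row, col) of spots whose brightness stays near the background."""
    free = []
    for r, row in enumerate(frame):
        for c, value in enumerate(row):
            if abs(value - background[r][c]) < threshold:
                free.append((r, c))
    return free

# A car (bright region) occupies spot (0, 1); every other spot reads as free.
frame = [
    [12, 200, 9],
    [11, 8, 13],
]
print(empty_spots(frame, EMPTY_LOT))
```

A real system would of course need per-spot calibration and robustness to lighting changes, which is where the vision work lies.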

IP05. Tracking of Multiple RC Cars

Mark Woodward
Staff contact: Dan
Remote control car racing is a competitive, fast-paced sport, with races held every weekend in the Bay Area. In order to autonomously control an RC car in a race, the car and the opposing cars around it need to be accurately localized. The goal of this project is to use a single pan/tilt/zoom camera to localize a primary car and those around it. Due to the large area of the track, only a small portion is visible to the camera at any given time. Thus the cars will be moving in and out of the field of view. It would be desirable to predict the location of these cars even when they are not in view, and to identify specific cars as they return to view. Also, while the race is in progress, "turn marshals" are constantly running around the track "righting" cars that have flipped. The system should be able to accurately predict vehicle locations even when turn marshals or other obstacles occlude the vehicles.
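Predicting through occlusion is typically handled with a motion model, e.g. a Kalman-style filter. The sketch below is only an illustration of that idea under a constant-velocity assumption; the state layout, gain, and numbers are not from the project.

```python
# Hypothetical sketch: tracking an occluded car with a constant-velocity
# model plus a simplified measurement update. All names and numbers are
# illustrative assumptions, not the project's actual tracker.

def predict(state, dt):
    """Advance an (x, y, vx, vy) state by dt seconds at constant velocity."""
    x, y, vx, vy = state
    return (x + vx * dt, y + vy * dt, vx, vy)

def update(state, measurement, gain=0.5):
    """Blend the predicted position with a new observation (fixed-gain update)."""
    x, y, vx, vy = state
    mx, my = measurement
    return (x + gain * (mx - x), y + gain * (my - y), vx, vy)

# A car last seen at (0, 0) moving 2 m/s along x, then occluded for
# three frames (dt = 0.1 s): the filter coasts to x ≈ 0.6.
state = (0.0, 0.0, 2.0, 0.0)
for _ in range(3):
    state = predict(state, 0.1)
print(state)
```

When the car reappears, `update` pulls the coasted estimate back toward the new detection, which is also the natural point to re-confirm the car's identity.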

IP06. Depth Estimation from Single Images

Jeff Michels, Ashutosh Saxena
Staff contact: Dan
As part of an ongoing project to control lightweight autonomous vehicles, this project will attempt to learn approximate range estimates to nearby obstacles from single monocular images. Using relative range estimates from multiple regions of a given image, a simple control algorithm can be designed to avoid areas with obstacles nearby. A preliminary version of the algorithm has been used to drive a small remote control car. The ultimate goal will be to use it to fly a fixed-wing aircraft low to the ground while avoiding trees.

The project will involve using machine learning on images from a variety of sources to learn a metric of "distance to nearest obstacle" over parts of the image. Training images can come from correlated range and vision data or from graphically rendered artificial images. In order to improve the performance of the learning algorithm, simple image features of local regions must be computed, keeping in mind the constraint that the whole algorithm must run in real time. The image features include, but are not limited to, Laws' masks and other basic filters.
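As a hedged illustration of the feature step, Laws' masks are built as outer products of short 1-D kernels, and a region's texture energy is the mean absolute filter response. The patch data below is invented; this is a sketch of the standard technique, not the project's implementation.

```python
# Illustrative Laws' texture-energy feature for a local image region.
# L5 (level) and E5 (edge) are the standard 1-D Laws kernels; the patches
# are made-up examples.
import numpy as np

L5 = np.array([1, 4, 6, 4, 1], dtype=float)    # level (local average)
E5 = np.array([-1, -2, 0, 2, 1], dtype=float)  # edge

def laws_mask(a, b):
    """2-D Laws mask as the outer product of two 1-D kernels."""
    return np.outer(a, b)

def texture_energy(patch, mask):
    """Mean absolute filter response over a patch (valid convolution)."""
    h, w = mask.shape
    ph, pw = patch.shape
    responses = [
        np.sum(patch[i:i + h, j:j + w] * mask)
        for i in range(ph - h + 1)
        for j in range(pw - w + 1)
    ]
    return float(np.mean(np.abs(responses)))

# Vertical stripes respond to the horizontal-edge mask; a flat patch does not.
row = np.array(([0.0, 0.0, 1.0, 1.0] * 3)[:10])
stripes = np.tile(row, (10, 1))
flat = np.ones((10, 10))
e5l5 = laws_mask(L5, E5)
print(texture_energy(stripes, e5l5), texture_energy(flat, e5l5))
```

Energies like these, computed over each image region, form the feature vector that the learner maps to a range estimate; in practice the convolutions would be done with a fast filtering routine to meet the real-time constraint.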

IP07. Mineral identification from MER Pancam images

Mario Parente, Fernando Amat
Staff contact: Dan
The Mars Exploration Rovers (MER) mission is a NASA mission to assess the morphology, topography, and geologic context of the Martian surface. Another objective is to obtain multispectral visible to short-wave near-IR images of selected regions to determine surface color and mineralogic properties. Pancam is a camera pair; each camera carries a set of filters at different wavelengths. Images from the rovers are available on the web at http://anserver1.eprsl.wustl.edu/ . Camera parameters, the image format, and calibration processing are described in [1]. The goal of the project is to automatically identify the mineralogic content of the objects in the scenes by segmenting the images and then statistically learning the multispectral probability distribution of the pixels in each isolated region.
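One plausible form of the statistical-learning step, sketched here as an assumption rather than the project's method, is to fit a diagonal Gaussian to each region's multispectral pixel values and classify new pixels by likelihood. The class names and band values below are invented.

```python
# Hypothetical sketch: per-class Gaussian models over multispectral pixels,
# with maximum-likelihood classification. Mineral names and reflectance
# values are made up for illustration.
import math

def fit_gaussian(samples):
    """Per-band mean and variance for a list of multispectral pixel vectors."""
    n, bands = len(samples), len(samples[0])
    means = [sum(s[b] for s in samples) / n for b in range(bands)]
    variances = [
        sum((s[b] - means[b]) ** 2 for s in samples) / n + 1e-6
        for b in range(bands)
    ]
    return means, variances

def log_likelihood(pixel, model):
    """Diagonal-Gaussian log-likelihood of one multispectral pixel."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(pixel, means, variances)
    )

def classify(pixel, models):
    """Assign the pixel to the class whose model makes it most likely."""
    return max(models, key=lambda name: log_likelihood(pixel, models[name]))

# Two invented "mineral" classes trained from 3-band reflectance samples.
models = {
    "hematite": fit_gaussian([(0.2, 0.3, 0.7), (0.25, 0.32, 0.68)]),
    "basalt":   fit_gaussian([(0.1, 0.1, 0.12), (0.12, 0.09, 0.1)]),
}
print(classify((0.22, 0.31, 0.69), models))
```

With Pancam's multiple filter bands, each segmented region contributes many such pixel vectors, so the per-region distributions can be estimated more reliably than from single pixels.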

IP08. Real-Time Feature-Based Mosaicking

Kiran Murthy
Staff contact: Dan
The Aerospace Robotics Lab (ARL) is pursuing an ocean floor mosaicking project in conjunction with the Monterey Bay Aquarium Research Institute (MBARI). MBARI's tethered Remotely Operated Vehicle (ROV) currently has the capability to use images from its downward-looking camera to construct real-time mosaics of the ocean floor. However, the image mosaicking is imperfect, as the algorithm does not compensate for the scale and rotation changes caused by variations in the ROV's altitude and heading. As a result, the ARL desires a more precise image mosaicking algorithm that compensates for differences in image scale and rotation. The ARL plans to run the more precise algorithm in conjunction with the less precise one: a rough mosaic can be constructed in real time, while in the background a more precise mosaic constantly replaces the rough one.

This vision project will attempt to use SIFT feature algorithms to construct the precise real-time mosaic. If time permits, the project will investigate using a combination of SIFT features and ROV altitude and heading metadata to construct the mosaic.
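To illustrate the scale-and-rotation compensation (a sketch of the general technique, not the ARL algorithm): once SIFT matches between two frames are available, a similarity transform can be recovered from as few as two matched points, conveniently using complex arithmetic. The point coordinates below are invented.

```python
# Illustrative recovery of scale, rotation, and translation between two
# frames from two matched feature points, with image points treated as
# complex numbers: q = a * p + b. All coordinates are made-up examples.
import cmath
import math

def similarity_from_matches(p1, p2, q1, q2):
    """Map frame-A points p to frame-B points q via q = a*p + b."""
    a = (q1 - q2) / (p1 - p2)   # |a| is the scale, arg(a) is the rotation
    b = q1 - a * p1             # translation
    return a, b

# Frame B is frame A scaled by 2, rotated 90 degrees, shifted by (1, 0):
# p = 0 maps to q = 1, and p = 1 maps to q = 1 + 2j.
a, b = similarity_from_matches(0 + 0j, 1 + 0j, 1 + 0j, 1 + 2j)
print(abs(a), math.degrees(cmath.phase(a)))  # recovered scale 2.0, rotation 90.0
```

With many SIFT matches per frame pair, the same transform would instead be estimated robustly by least squares or RANSAC, which is what makes a feature-based mosaic tolerant of altitude and heading changes.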

Precise real-time feature-based mosaics have many applications in user interface design and robotic navigation. In the immediate future, the ARL hopes to use mosaicking in their user interface so that the ROV's remote pilot always knows the craft's position relative to the path that the ROV has taken through the ocean. In terms of robotic navigation, it may be possible to incorporate the features found by real-time mosaicking with Simultaneous Localization and Mapping (SLAM) algorithms in order to assist a robot in autonomously navigating terrain by remembering environmental features along its path.
