Automated Learning and Discovery: State-Of-The-Art and Research Topics in a Rapidly Growing Field

Sebastian Thrun, Christos Faloutsos, Tom Mitchell, Larry Wasserman

The field of automated learning and discovery---often called data mining, machine learning, or advanced data analysis---is currently undergoing a major change. The progressing computerization of professional and private life, paired with a sharp increase in the memory, processing, and networking capabilities of today's computers, makes it increasingly possible to gather and analyze vast amounts of data. For the first time, people all around the world are connected to each other electronically through the Internet, making huge amounts of online data available at an exponentially growing rate.

Sparked by these innovations, we are currently witnessing a rapid growth of a new industry, called the data mining industry. Companies and governments have begun to realize the power of computer-automated tools for systematically gathering and analyzing data. For example, medical institutions have begun to utilize data-driven decision tools for diagnostic and prognostic purposes; various financial companies have begun to analyze their customers' behavior in order to maximize the effectiveness of marketing efforts; the Government now routinely applies data mining techniques to discover national threats and patterns of illegal activities in intelligence databases; and an increasing number of factories apply automatic learning methods to optimize process control. These examples illustrate the immense societal importance of the field.

At the same time, we are witnessing a healthy increase in research activities on issues related to automated learning and discovery. Recent research has led to revolutionary progress, both in the types of methods that are available and in the understanding of their characteristics. While the broad topic of automated learning and discovery is inherently cross-disciplinary in nature---it falls right into the intersection of disciplines like statistics, computer science, cognitive psychology, and robotics, and its users, such as medicine, social sciences, and public policy---these fields have mostly studied this topic in isolation. So where is the field, and where is it going? What are the most promising research directions? What are the opportunities for cross-cutting research, and what is worth pursuing?

The full paper is available in gzipped Postscript and PDF