Foreword by Tom M. Mitchell
Over the past thirty years the field of Machine Learning has developed a
sequence of increasingly successful paradigms for automatically learning
general laws from specific training data. Algorithms for learning neural
networks and decision trees are now in widespread use in data mining
applications such as learning to detect credit card fraud, in control
applications such as optimizing manufacturing processes, and in sensor
interpretation tasks such as learning to recognize human speech and human
faces. While these algorithms demonstrate the practical importance of machine
learning methods, researchers are actively pursuing yet more effective
algorithms.
This manuscript describes research aimed at a new generation of machine
learning methods -- methods that enable the computer to learn more
accurately from less training data. The key to this new approach is to
take advantage of other, previously acquired knowledge. To illustrate the idea,
consider a mobile robot or process control system that must learn a control
strategy to achieve a new type of goal (e.g., locating a new type of
object) in a familiar environment (e.g., the building in which it has
operated for some time). Because the robot has experience in this
environment, it is likely to have previously acquired data or knowledge
that can be helpful in learning the new task. It might, for example, have
learned to predict the approximate effect of various robotic actions on
subsequent sensor input. The Explanation-Based Neural Network (EBNN)
learning algorithm presented here takes advantage of such prior knowledge,
even if it is inexact, to significantly improve accuracy for the new
learning task. Whereas earlier methods such as neural network and decision
tree induction make use only of the training data for the current learning
task, this monograph explores several settings in which previous experience
in related tasks can be used to successfully bootstrap new learning.
While the specific EBNN learning algorithm presented here is interesting
for its ability to use approximate prior knowledge to improve learning
accuracy, the significance of this paradigm goes beyond this particular
algorithm. The paradigm of lifelong learning -- using previously learned
knowledge to improve subsequent learning -- is a promising direction for a
new generation of machine learning algorithms. Whereas recent theoretical
results have shown fundamental bounds on the learning accuracy achievable
with pure induction from input-output examples of the target function, the
lifelong learning paradigm provides a new setting in which these
theoretical bounds are sidestepped by the introduction of knowledge
accumulated over a series of learning tasks. While it is too early to
determine the eventual outcome of this line of research, it is an exciting
and promising attempt to confront the issue of scaling up machine learning
algorithms to more complex problems. Given the need for more accurate
learning methods, it is difficult to imagine a future for machine learning
that does not include this paradigm.
Tom M. Mitchell
Pittsburgh
November 1995