Introduction to KDD-UAI Joint Sessions
Usama Fayyad and Eric Horvitz, Microsoft Research
The last several decades have been a fertile time
for the birth and evolution of a variety of interrelated disciplines
centered on problems and opportunities in computer-based inference
and analysis. Several conferences and corresponding research communities
have focused on the role of computational methods in uncertainty
and statistics. Although the conceptual centers of these conferences
may be distinct, it is not uncommon to see significant overlap
and synergistic relationships among sets of contributions at the
different conferences. Such conceptual overlap and synergy are
exemplified by the relationships between selected contributions
at the Conference on Uncertainty in Artificial Intelligence (UAI)
and at the International Conference on Knowledge Discovery and
Data Mining (KDD).
The Conference on Uncertainty in Artificial Intelligence
(UAI) was founded eleven years ago by investigators with a passion
for solving challenging problems in computer-based inference,
decision making, and learning, with a focus on extending and applying
principled methods for reasoning under uncertainty. Over the last
decade, the UAI conference, and its corresponding community, have
addressed fundamental computational and representational problems
with reasoning under uncertainty, and have contributed a wide
array of theoretical and engineering solutions to difficult challenges.
At the center of a great deal of UAI research are graphical representations
of probabilistic dependencies. Although work focusing on the links
between graphical models of probabilistic dependencies and problems
of learning from data has been represented at almost all UAI
conferences, there has been a remarkable increase in the number
of contributions in this area over the last three years.
In 1989, the first KDD meeting took place as a small
workshop at IJCAI-89. The workshops continued to meet every two years
until 1993. Following a large meeting at KDD-94,
the limited attendance workshop format was changed to an open
attendance conference in Montreal, collocated with IJCAI-95. KDD-95
attracted over 350 attendees and was an exciting and successful
inaugural conference. Key contributors in learning and graphical
models from the UAI community played a role in the first KDD conference.
As the KDD and UAI conferences have overlapped in timing, several
people with interest in both conferences found themselves traveling
back and forth to attend sessions at the two conferences.
The KDD conference was established as the key meeting
of researchers and engineers pursuing techniques for discovering
useful knowledge in large databases. Investigators in this area
have been passionate about developing a new generation of computational
techniques and tools to help people extract knowledge from
the rapidly growing volumes of data. Clearly, our ability to gather
and accumulate data has progressed far beyond our ability to analyze
it and to transform it into useful information. We use the term
KDD to refer to the multi-step and iterative process consisting
of data selection/access methods, data cleaning, choice of representation,
extraction of patterns/models (data mining), and interpretation,
evaluation, and visualization of the patterns/structures derived
from the data store. Hence data mining is one step
within the KDD process. By necessity, data mining
includes both principled and heuristic approaches to learning
from data, drawn from a multitude of communities including statistics,
databases, pattern recognition, machine learning, and high-performance/parallel
computing.
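The steps of the KDD process listed above can be illustrated with a
minimal sketch. Everything in it is an assumption made for illustration:
the toy market-basket records, the function names, and the choice of
frequent item pairs as the pattern to mine are hypothetical and are not
drawn from any paper in the joint sessions. The sketch shows a single
pass through the steps; in practice the process is iterative, and
earlier steps are revisited as results are interpreted.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data store: each record is one market-basket transaction.
RAW_RECORDS = [
    {"id": 1, "items": "bread, milk"},
    {"id": 2, "items": "bread, butter, milk"},
    {"id": 3, "items": None},  # missing value, dropped during cleaning
    {"id": 4, "items": "Milk , bread"},
    {"id": 5, "items": "butter, jam"},
]


def select(records):
    """Data selection: keep only the field relevant to the task."""
    return [r["items"] for r in records]


def clean(values):
    """Data cleaning: drop missing entries, normalize case and whitespace."""
    return [[t.strip().lower() for t in v.split(",")] for v in values if v]


def represent(rows):
    """Choice of representation: each transaction becomes a set of items."""
    return [frozenset(row) for row in rows]


def mine(transactions, min_support=0.5):
    """Data mining: find item pairs whose support meets the threshold."""
    counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


def interpret(patterns):
    """Interpretation/evaluation: rank patterns by support for a reader."""
    return sorted(patterns.items(), key=lambda kv: -kv[1])


def kdd_process(records, min_support=0.5):
    """One pass through the steps; real use would iterate and refine."""
    return interpret(mine(represent(clean(select(records))), min_support))


result = kdd_process(RAW_RECORDS)
print(result)  # [(('bread', 'milk'), 0.75)]
```

Only the mine function corresponds to data mining proper; the other
functions stand in for the surrounding steps of the KDD process.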
The intersection of UAI and KDD centers on the opportunity
for developing better theories and methods for data mining grounded
in the foundations of probability and the use of graphical representations
of probabilistic dependencies. A central problem in KDD is one
of statistical inference and the management of uncertainty. Although
large data volumes can go a long way towards enabling the inference
and verification of fairly complex models from data, the curse
of dimensionality ensures that model extraction remains a difficult
challenge. In addition, finding the right models typically requires
a huge amount of search. One of the most effective ways of dealing
with high dimensionality and problems of large search spaces is
to employ prior knowledge of the underlying data-generating process.
Coherent techniques for probabilistic inference under uncertainty,
for encoding of prior knowledge, and for efficiently representing
probability distributions have been a primary focus of the UAI
literature on learning. At the intersection between UAI and KDD
lies a wealth of rich research problems that span the spectrum
from theoretical issues to detailed applied techniques for statistical
inference and reasoning with knowledge.
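As a small illustration of how encoded prior knowledge can help when
data are scarce, the following sketch compares a maximum-likelihood
estimate of a success rate with the posterior mean under a Beta prior.
The Beta-Bernoulli model, the function names, and the numbers are
assumptions chosen for illustration only; they are not taken from the
papers in the joint sessions.

```python
def mle_estimate(successes, trials):
    """Maximum-likelihood estimate of a Bernoulli success rate."""
    return successes / trials


def posterior_mean(successes, trials, prior_a=1.0, prior_b=1.0):
    """Posterior mean of the rate under a Beta(prior_a, prior_b) prior.

    With a Bernoulli likelihood, the posterior is
    Beta(prior_a + successes, prior_b + trials - successes).
    """
    return (successes + prior_a) / (trials + prior_a + prior_b)


# Hypothetical scenario: three observations, all successes.
# Assumed prior knowledge: the rate is usually near 0.2, encoded as Beta(2, 8).
print(mle_estimate(3, 3))              # estimate from the data alone
print(posterior_mean(3, 3, 2.0, 8.0))  # estimate tempered by the prior
```

With only three observations, the data-only estimate sits at the
extreme value of 1.0, while the prior pulls the estimate toward the
range suggested by the assumed background knowledge. As more data
arrive, the two estimates converge.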
Each paper presented at the joint session underwent
peer review by the program committee of the home conference to
which it was submitted. Papers were selected for the joint sessions
only after final decisions of acceptance and presentation were
made independently by the program chairs and program committees
of each conference. Papers were selected from the KDD-96 and UAI-96
proceedings based on several criteria including the general interest
of the work to both communities, the consideration of problems
in learning models from data, and the focus on probabilistic methods.
The selected papers are being presented only in plenary format
at the joint sessions, and the sessions are an integral component
of both the UAI and the KDD conferences.
The papers appearing in the joint session can be
accessed in the corresponding proceedings and should only be referenced
and cited as appearing in their home proceedings. Pointers to
the full papers presented at the joint sessions have been collected
on a WWW page at http://cuai-96.microsoft.com/kdd-uai.
It is our hope that this joint session will serve
to strengthen links between the KDD and UAI communities and highlight
the strong overlap in interests and applicability of results from
the communities. We hope that members of each community with interests
in learning and data mining will find it easy to attend and contribute
at future UAI and KDD conferences.
We would like to thank the American Association for Artificial
Intelligence (AAAI), the Association for Uncertainty in Artificial
Intelligence (AUAI), Morgan Kaufmann Publishers, Steve Hanks, Finn
Jensen, Evangelos Simoudis, and Jiawei Han, and the sponsors of both
conferences for making this special joint session possible.
Usama Fayyad and Eric Horvitz
Organizers
KDD-UAI Special Joint Sessions