Introduction to KDD-UAI Joint Sessions

Usama Fayyad and Eric Horvitz, Microsoft Research

The last several decades have been a fertile time for the birth and evolution of a variety of interrelated disciplines centered on problems and opportunities in computer-based inference and analysis. Several conferences and corresponding research communities have focused on the role of computational methods in uncertainty and statistics. Although the conceptual centers of these conferences may be distinct, it is not uncommon to see significant overlap and synergistic relationships among sets of contributions at the different conferences. Such conceptual overlap and synergy are exemplified by the relationships between selected contributions at the Conference on Uncertainty in Artificial Intelligence (UAI) and at the International Conference on Knowledge Discovery and Data Mining (KDD).

The Conference on Uncertainty in Artificial Intelligence (UAI) was founded eleven years ago by investigators with a passion for solving challenging problems in computer-based inference, decision making, and learning, with a focus on extending and applying principled methods for reasoning under uncertainty. Over the last decade, the UAI conference and its corresponding community have addressed fundamental computational and representational problems in reasoning under uncertainty, and have contributed a wide array of theoretical and engineering solutions to difficult challenges. At the center of a great deal of UAI research are graphical representations of probabilistic dependencies. Although work focusing on the links between graphical models of probabilistic dependencies and problems of learning from data has been represented at almost all UAI conferences, there has been a remarkable increase in the number of contributions in this area over the last three years.

In 1989, the first KDD meeting took place as a small workshop at IJCAI-89. The workshops continued to meet on a biennial basis up until 1993. Following a large meeting at KDD-94, the limited-attendance workshop format was changed to an open-attendance conference, held in Montreal and collocated with IJCAI-95. KDD-95 attracted over 350 attendees and was an exciting and successful kickoff conference. Key contributors in learning and graphical models from the UAI community played a role in the first KDD conference. As the KDD and UAI conferences have overlapped in timing, several people with interest in both conferences found themselves traveling back and forth to attend sessions at the two conferences.

The KDD conference was established as the key meeting of researchers and engineers pursuing techniques for discovering useful knowledge in large databases. Investigators in this area have been passionate about developing a new generation of computational techniques and tools to help people extract knowledge from rapidly growing volumes of data. Clearly, our ability to gather and accumulate data has progressed far beyond our ability to analyze it and to transform it into useful information. We use the term KDD to refer to the multi-step and iterative process consisting of data selection/access methods, data cleaning, choice of representation, extraction of patterns/models (data mining), and interpretation, evaluation, and visualization of the patterns/structures derived from the data store. Data mining is thus one step within the KDD process. By necessity, it includes both principled and heuristic approaches to learning from data, drawn from a multitude of communities including statistics, databases, pattern recognition, machine learning, and high-performance/parallel computing.
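The steps of the KDD process named above can be illustrated with a small sketch. This is only a toy illustration under stated assumptions, not a method drawn from any paper in the joint sessions: the function names, the record format, and the use of simple frequency counting as the "data mining" step are all hypothetical choices made for the example.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# Hypothetical record format: a mapping from field name to a value or None.
Record = Dict[str, Optional[str]]
Row = Tuple[str, ...]


def select(records: List[Record], fields: List[str]) -> List[Record]:
    """Data selection: keep only the fields of interest."""
    return [{f: r.get(f) for f in fields} for r in records]


def clean(records: List[Record]) -> List[Record]:
    """Data cleaning: drop records that have a missing value."""
    return [r for r in records if all(v is not None for v in r.values())]


def represent(records: List[Record], fields: List[str]) -> List[Row]:
    """Choice of representation: each record becomes a tuple of values."""
    return [tuple(r[f] for f in fields) for r in records]


def mine(rows: List[Row]) -> Counter:
    """Data mining (assumed here to be plain frequency counting)."""
    return Counter(rows)


def evaluate(patterns: Counter, min_support: int) -> List[Tuple[Row, int]]:
    """Evaluation: keep patterns seen at least min_support times."""
    return sorted((p, c) for p, c in patterns.items() if c >= min_support)


def kdd_process(records: List[Record], fields: List[str],
                min_support: int) -> List[Tuple[Row, int]]:
    """Run the steps once, in order; in practice they are revisited."""
    rows = represent(clean(select(records, fields)), fields)
    return evaluate(mine(rows), min_support)
```

In this sketch, data mining occupies a single function among five, which mirrors the point that it is one step within the larger process. A realistic system would loop back to earlier steps as the evaluation suggests new selections or representations.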

The intersection of UAI and KDD centers on the opportunity for developing better theories and methods for data mining grounded in the foundations of probability and the use of graphical representations of probabilistic dependencies. A central problem in KDD is one of statistical inference and the management of uncertainty. Although large data volumes can go a long way towards enabling the inference and verification of fairly complex models from data, the curse of dimensionality ensures that model extraction remains a difficult challenge. In addition, finding the right models typically requires a huge amount of search. One of the most effective ways of dealing with high dimensionality and problems of large search spaces is to employ prior knowledge of the underlying data-generating process. Coherent techniques for probabilistic inference under uncertainty, for encoding of prior knowledge, and for efficiently representing probability distributions have been a primary focus of the UAI literature on learning. At the intersection between UAI and KDD lies a wealth of rich research problems that span the spectrum from theoretical issues to detailed applied techniques for statistical inference and reasoning with knowledge.
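To make the remarks on dimensionality and efficient representation concrete, the following sketch compares parameter counts for two ways of representing a joint distribution. It rests on assumptions not stated in the text: all variables are binary, and the graphical model is a directed one in which each variable has at most a fixed number of parents. The function names are hypothetical.

```python
def full_joint_parameters(n_vars: int) -> int:
    """Free parameters in an unrestricted joint table over binary variables.

    The table has 2**n_vars entries that must sum to one.
    """
    return 2 ** n_vars - 1


def bounded_parent_parameters(n_vars: int, max_parents: int) -> int:
    """Upper bound on free parameters when each binary variable has at most
    max_parents parents: one parameter per parent configuration per variable.
    """
    return n_vars * 2 ** max_parents


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(n, full_joint_parameters(n), bounded_parent_parameters(n, 3))
```

Under these assumptions, twenty binary variables require 1,048,575 parameters in the unrestricted table but at most 160 when each variable has no more than three parents. Prior knowledge about which dependencies exist is what licenses the smaller representation.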

Each paper presented at the joint sessions underwent peer review by the program committee of the home conference to which it was submitted. Papers were selected for the joint sessions only after final decisions of acceptance and presentation were made independently by the program chairs and program committees of each conference. Papers were selected from the KDD-96 and UAI-96 proceedings based on several criteria, including the general interest of the work to both communities, the consideration of problems in learning models from data, and the focus on probabilistic methods. The selected papers are being presented only in plenary format at the joint sessions, and the sessions are an integral component of both the UAI and the KDD conferences.

The papers appearing in the joint sessions can be accessed in the corresponding proceedings and should be referenced and cited only as appearing in their home proceedings. Pointers to the full papers presented at the joint sessions have been collected on a Web page at http://cuai-96.microsoft.com/kdd-uai.

It is our hope that this joint session will serve to strengthen links between the KDD and UAI communities and highlight the strong overlap in interests and applicability of results from the communities. We hope that members of each community with interests in learning and data mining will find it easy to attend and contribute at future UAI and KDD conferences.

We would like to thank the American Association for Artificial Intelligence (AAAI), the Association for Uncertainty in Artificial Intelligence (AUAI), Morgan Kaufmann Publishers, Steve Hanks, Finn Jensen, Evangelos Simoudis, Jiawei Han, and the sponsors of both conferences for making this special joint session possible.

Usama Fayyad and Eric Horvitz
Organizers
KDD-UAI Special Joint Sessions