UAI 2004 Invited Talks


 

Cascading Behavior and Bursty Dynamics in Computational Models of Social Networks

 

 

Jon Kleinberg

Department of Computer Science

Cornell University

Home Page: http://www.cs.cornell.edu/home/kleinber/

 

 

Probabilistic models for the processes by which ideas and influence propagate through a social network have been studied in a number of domains, including the diffusion of scientific and technological innovations, the sudden and widespread adoption of various strategies in game-theoretic settings, and the effects of ``word of mouth'' in the promotion of new products. Emerging on-line media such as Weblogs offer a compelling new setting in which to study such phenomena, since they are characterized by frequent ``topic bursts'' in which a virtual discussion spreads rapidly through the network of individuals who author on-line content.

A natural way to identify highly influential individuals in such network settings is to look for collections of nodes with the power to trigger large outbreaks of cascading behavior. In other words, if we can try to convince a subset of individuals to adopt a new behavior (e.g. to try a new product or take part in an on-line discussion), and the goal is to cause a large cascade of further adoptions, which set of individuals should we target?


In this talk, we will consider such problems in several of the most widely studied models in social network analysis, and describe algorithms (obtained in joint work with David Kempe and Eva Tardos) with provable approximation guarantees for the problem of selecting the most influential set of nodes. The development of these algorithms indicates how influential sets must include members from diverse parts of the network, forcing heuristics for this problem to deal, at least implicitly, with the clustering of the underlying population. We conclude by discussing the intriguing relationship between models of influence propagation and the problem of identifying ``bursty'' topics, which is emerging as an important issue for searching on-line information resources.
 

 


 

What is the matter? Explorations in text categorization

 

   

Lillian Lee

Department of Computer Science

Cornell University

Home Page:

http://www.cs.cornell.edu/home/llee/

 

 

Human language is rife with ambiguity, both intentional and unwitting, which means that systems for analyzing human-authored documents must reason under uncertainty as a matter of course. Even the presumably easier problem of text categorization, construed broadly to refer to the grouping of documents that are somehow similar in content, remains a challenge. Nonetheless, this talk will discuss work on two text-categorization problems in which new algorithms lacking access to prior linguistic knowledge perform surprisingly well.

We first present Iterative Residual Rescaling, a subspace-projection method that generalizes and empirically outperforms Latent Semantic Indexing (LSI) in representing inter-document topic-based similarity measurements. The new algorithm is motivated by analysis employing matrix perturbation theory that reveals a precise relationship between LSI's performance and the uniformity of the document set's underlying topic distribution.

We then address a different problem that has attracted a great deal of attention recently: that of sentiment classification, an example of which is determining whether a movie review is "thumbs up" or "thumbs down". We demonstrate that new techniques based on finding minimum cuts in graphs yield state-of-the-art results.

Joint work with Rie Kubota Ando and with Bo Pang.

 


 

Good-Turing estimation and its applications

 

 

Alon Orlitsky

Departments of ECE and CSE

University of California at San Diego

Home Page:

 http://kodiak.ucsd.edu/alon/

While deciphering the Enigma Code during World War II, I.J. Good and A.M. Turing derived a surprising and unintuitive formula for estimating a probability distribution from a sample of data. The eponymous Good-Turing Estimator has since been used in a variety of applications and studied by a number of researchers. We define a simple measure for the performance of an estimator and use it to show that the Good-Turing estimator is good but not optimal and to derive an asymptotically optimal estimator. We outline three applications of such estimators.

Joint work with N.P. Santhanam, K. Viswanathan and J. Zhang.


Markov Processes and Markov Decision Processes: The Verification Perspective

 

Moshe Y. Vardi

Department of Computer Science

Rice University

Home Page:

http://www.cs.rice.edu/~vardi/

In property-based verification one uses algorithmic techniques to establish the correctness of the design with respect to a given property. The verification technique of model checking is based on a small number of key algorithmic ideas, tying together graph theory, automata theory, and logic. In the past 20 years, model checking has been applied successfully to the analysis of protocols and circuits, increasingly becoming a standard industrial practice. In this talk I will describe how model checking can be adapted to Markov Processes and Markov Decision Processes, offering a property-based approach to the analysis of these models.