Spring 2006 Seminar

  Machine Learning Seminar Series
 
 
  Seminar Schedule (Seminar Organizer: Prof. Ziv Bar-Joseph)

Date: April 12, 2006
Time: 3:00 PM - 4:00 PM
Location: 1305 Newell-Simon Hall
Speaker: Katherine A. Heller, Research Student, Gatsby Computational Neuroscience Unit
Title: A Bayesian Approach to Information Retrieval Using Sets of Items
Abstract: Inspired by Google Sets, we consider the problem of retrieving items from a concept or cluster, given a query consisting of a few items from that cluster. We formulate this as a Bayesian inference problem and describe a very simple algorithm for solving it. Our algorithm uses a model-based concept of a cluster and ranks items using a score that evaluates the marginal probability that each item belongs to a cluster containing the query items. We focus on sparse binary data and show that our score can be evaluated exactly using a single sparse matrix multiplication, making it possible to apply our algorithm to very large datasets. We evaluate our algorithm on such tasks as retrieving movies from the EachMovie dataset, finding completions of author sets from the NIPS dataset, and finding completions of sets of words appearing in the Grolier encyclopedia. We also use our algorithm to develop a new system for performing content-based image retrieval, and evaluate it using a Corel image database of 32,000 images.
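
For readers who want a feel for why the score reduces to one sparse product, here is a minimal sketch. It is not the speaker's implementation. It assumes each binary feature is modeled as an independent Bernoulli variable with a Beta prior, and that the score for an item x is the log of the ratio p(x | query) / p(x). The abstract does not state these details. The function name `bayesian_sets_scores`, the prior-strength parameter `kappa`, and the add-one smoothing of the feature means are all hypothetical choices made for illustration.

```python
import math


def bayesian_sets_scores(items, query_ids, n_features, kappa=2.0):
    """Rank items against a query set under an assumed Beta-Bernoulli model.

    items      : list of sets; items[i] holds the indices of the features
                 that are 1 for item i (a sparse binary row).
    query_ids  : indices into `items` that form the query set.
    n_features : total number of binary features.
    kappa      : assumed prior strength (illustrative, not from the abstract).
    Returns one log-score per item; higher means "more like the query".
    """
    n_items = len(items)

    # Smoothed empirical mean of each feature (smoothing is an assumption
    # that keeps both Beta hyperparameters strictly positive).
    counts = [0] * n_features
    for row in items:
        for j in row:
            counts[j] += 1
    means = [(c + 1.0) / (n_items + 2.0) for c in counts]
    alpha = [kappa * m for m in means]
    beta = [kappa * (1.0 - m) for m in means]

    # Per-feature counts of ones within the query set.
    n_query = len(query_ids)
    s = [0] * n_features
    for i in query_ids:
        for j in items[i]:
            s[j] += 1

    # The log-score is affine in x: log score(x) = c + sum_j q[j] * x[j].
    c = 0.0
    q = [0.0] * n_features
    for j in range(n_features):
        a_post = alpha[j] + s[j]
        b_post = beta[j] + n_query - s[j]
        c += (math.log(alpha[j] + beta[j])
              - math.log(alpha[j] + beta[j] + n_query)
              + math.log(b_post) - math.log(beta[j]))
        q[j] = (math.log(a_post) - math.log(alpha[j])
                - math.log(b_post) + math.log(beta[j]))

    # Sparse matrix-vector product: only the active features of each row
    # contribute, so the cost scales with the number of nonzeros.
    return [c + sum(q[j] for j in row) for row in items]
```

Because the log-score is affine in the item's feature vector, scoring the whole collection amounts to multiplying the sparse item-by-feature matrix by the weight vector `q` and adding the constant `c`. The final list comprehension does exactly that, one sparse row at a time.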