Fall 2008 Seminar

 
 
 
       
  Machine Learning Seminar Series
 
 
  Seminar Schedule (Seminar Organizer: Prof. Ziv Bar-Joseph)
 

 

ML/Google Seminars

Machine Learning Lunch Seminar

 

 

Date: December 4, 2008
Time: 4:00 PM - 5:30 PM
Location: 4625 Wean Hall
Speaker: Hanghang Tong Ph.D. Candidate
Title: Tools and Algorithms for Querying and Mining Large Graphs
Abstract: Graphs appear in a wide range of settings, such as computer networks, the world wide web, biological networks, social networks (MSN/FaceBook/LinkedIn) and many more. How can we find user-specific patterns (e.g., master mind, money laundry ring) from such graphs? How can we spot anomaly in a dynamic and intuitive way? How can we find the communities with optional constrains? How can we mine time/space in the complex context? In this thesis, we focus on two types of tasks according to the interaction with users: (1) querying and (2) mining. For the task of querying, we have mainly studied the following three sub tasks. First, we focus on how to find complex user-specific patterns from large graphs, where we have addressed three applications: (1) Center-piece subgraph discovery for plain graphs; (2) Best effort pattern match for attributed graphs; and (3) Querying with feedback. Then, we focus on (1) how to predict the direction of the link; and (2) how to address the temporal issue in querying. Finally, since the main tool for querying large graphs is the proximity measurement, we have designed a family of fast solutions (FastProx) in several different settings, which often achieves up to 2 orders of magnitude speedup without or with very little quality loss. For the task of mining, we have studied the following two sub tasks. First, we have proposed a family of example-based low-rank approximation methods (Colibri) for anomaly detection. It can work for both static graphs and dynamic graphs. It achieves significant speedups and space saving over the existing methods without quality loss. Then, we have designed (T3) to mine temporal information in the context of graphs, which can find similar time stamps as well as abnormal time stamps; and also provide the interpretations for our findings. Furthermore, we have proposed (MT3) to speedup the multiple resolution analysis. Future work includes two aspects. First, we would like to design tools to detect communities from graphs with optional constrains. Then, we will study how to mine the spatial information in the context of graphs. Besides, we plan to investigate diffusion wavelet as an alternative way for querying and mining large graphs. Thesis Committee: Christos Faloutsos, Chair, William Cohen, Jeff Schneider, Philip S. Yu, UIC (University of Illinois at Chicago). Proposal Document: www.cs.cmu.edu/~htong/work/thesis/proposal-tong.pdf