| Date: | March 23, 2006 |
| Time: | 2:30 PM - 3:30 PM |
| Location: | 1507 Newell-Simon Hall |
| Speaker: |
Andrew McCallum, Associate Professor, University of Massachusetts Amherst |
| Title: | Topic Models for Social Network Analysis and Bibliometrics |
| Abstract: |
Topic models, such as Latent Dirichlet Allocation and its progeny, are increasingly popular tools for summarization and knowledge discovery in text and other discrete data. This talk will present several new generative topic models that combine unstructured text with structured data, such as links, relations, time-stamps, and n-gram sequences. I will demonstrate these methods' capabilities in enabling role and group discovery in social network data, and enabling new bibliometric
impact measures mined from over 1 million research papers gathered by our new web portal, Rexa.info. Finally, I will briefly introduce very recent work in multi-conditional mixtures, alternative topic models that have some similarities to conditional random fields.
Joint work with colleagues at UMass: Xuerui Wang, Natasha Mohanty, Andres Corrada, Chris Pal, Wei Li, David Mimno, and Gideon Mann. |
| Speaker Bio: |
Andrew McCallum is an Associate Professor at the University of
Massachusetts Amherst. He was previously Vice President of Research
and Development at WhizBang Labs, a company that used machine learning
for information extraction from the Web. In the late 1990s, he was a
Research Scientist and Coordinator at Justsystem Pittsburgh Research
Center, where he spearheaded the creation of CORA, an early research
paper search engine that used machine learning for spidering,
extraction, classification and citation analysis. He was a
post-doctoral fellow at Carnegie Mellon University after receiving his
PhD from the University of Rochester in 1995. He is an action editor
for the Journal of Machine Learning Research. For the past ten years, McCallum has been active in research on statistical machine learning applied to text, especially information extraction, document classification, clustering, finite state models, semi-supervised learning, and social network analysis. Web page: http://www.cs.umass.edu/~mccallum. |