| Abstract: |
Collective classification has been widely studied as a way to predict class labels simultaneously for relational data, such as hyperlinked webpages, social networks, and data in a relational database. Most existing collective classification methods are expensive because they rely on iterative inference and on learning procedures based on iterative optimization. We propose stacked graphical learning for efficient collective classification.
In stacked graphical learning, a base learner is first applied to the training data to make predictions using a cross-validation-like technique. We then expand the features by adding the predictions of relevant examples to each feature vector. Finally, the base learner is applied to the expanded feature set to make the final predictions.
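The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation from the thesis: the base learner is a toy nearest-centroid classifier, "relevant examples" are taken to be explicitly listed neighbors, their predictions are aggregated by a simple mean, and all function names (`fit_centroids`, `cross_val_predict`, `expand`, `fit_stacked`, `predict_stacked`) as well as the toy data are hypothetical.

```python
from statistics import mean

def fit_centroids(X, y):
    """Toy base learner (an assumption): one mean vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids

def predict_centroids(model, X):
    """Assign each example to the class with the nearest centroid."""
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(sorted(model), key=lambda c: dist(model[c], x)) for x in X]

def cross_val_predict(X, y, k=2):
    """Step 1: predict each training example with a model that never saw it."""
    preds = [None] * len(X)
    for fold in range(k):
        held_out = [i for i in range(len(X)) if i % k == fold]
        rest = [i for i in range(len(X)) if i % k != fold]
        model = fit_centroids([X[i] for i in rest], [y[i] for i in rest])
        fold_preds = predict_centroids(model, [X[i] for i in held_out])
        for i, p in zip(held_out, fold_preds):
            preds[i] = p
    return preds

def expand(X, preds, neighbors):
    """Step 2: append the mean predicted label of each example's neighbors."""
    expanded = []
    for x, nbrs in zip(X, neighbors):
        # 0.5 is an arbitrary neutral value for examples without neighbors.
        agg = mean(preds[j] for j in nbrs) if nbrs else 0.5
        expanded.append(list(x) + [agg])
    return expanded

def fit_stacked(X, y, neighbors, k=2):
    """Step 3: train the same learner again on the expanded features."""
    base = fit_centroids(X, y)
    cv_preds = cross_val_predict(X, y, k)
    stacked = fit_centroids(expand(X, cv_preds, neighbors), y)
    return base, stacked

def predict_stacked(models, X, neighbors):
    """One pass of the base model, one pass of the stacked model; no iteration."""
    base, stacked = models
    first_pass = predict_centroids(base, X)
    return predict_centroids(stacked, expand(X, first_pass, neighbors))

# Hypothetical toy data: one feature, binary labels, links within each class.
X = [[0.0], [0.2], [0.1], [0.3], [1.0], [0.9], [1.1], [0.8]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
neighbors = [[1], [0, 2], [1, 3], [2], [5], [4, 6], [5, 7], [6]]

models = fit_stacked(X, y, neighbors)
predictions = predict_stacked(models, X, neighbors)
```

The held-out predictions in step 1 matter because the stacked model should be trained on predictions of the same quality it will see at test time; predictions made on data the base model was trained on would be overly optimistic. Inference needs only two passes over the data, which is where the efficiency claim comes from in this sketch.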
We have applied stacked graphical learning to many real problems, including collective classification, sequential partitioning, information extraction, and multi-task problems in an information extraction system. We also formally analyze an idealized version of the algorithm, provide a proof of convergence for the idealized version of stacking, and discuss the conditions under which stacked graphical learning is nearly identical to its idealized version. An online version of stacked graphical learning is studied to save training time and to handle large streaming datasets with minimal memory overhead.
Thesis Committee:
William Cohen (Chair),
David Jensen (UMass),
Tom Mitchell,
Robert Murphy.
http://www.cs.cmu.edu/~woomy/thesis/thesis_kou.pdf
|