| Abstract: |
In active learning, a learning algorithm is given access to a large
pool of unlabeled examples, and is allowed to request the label of any
particular examples from that pool, interactively. The objective is to
learn a function that accurately predicts the labels of new examples,
while requesting as few labels as possible. Active learning can often
significantly decrease the work load of human annotators, compared to
passive learning where the examples to be labeled are chosen at
random. This is of particular interest for learning tasks where
unlabeled examples are available in abundance, but label information
requires significant effort to obtain.
In the passive learning literature, there are well-known bounds on the
rates of convergence of the loss of certain estimators, as a function
of the number of labeled examples observed. However, significantly
less is presently known about the analogous rates in active learning:
namely, the rate of convergence of the loss, as a function of the
number of label requests made by an active learning algorithm. In
this thesis proposal, I will outline some recent progress toward
understanding convergence rate improvements achievable by active
learning, along with general algorithms that achieve them. I will also
describe a few of the many open problems remaining on this topic,
which I hope to tackle in the near future.
Committee: Eric Xing (Chair), Avrim Blum, Sanjoy Dasgupta (UCSD), Larry Wasserman.
Proposal Document: http://www.cs.cmu.edu/~shanneke/docs/hanneke-proposal.pdf |