ML/Google Distinguished Lecture Series-Machine Learning Department - Carnegie Mellon University

Machine Learning Special Seminars

Fall 2016 Seminar
Date: September 26, 2016
Time: 10:00 AM
Location: 3305 Newell-Simon Hall
Speaker: Nihar Shah, PhD Candidate, EECS Dept., UC Berkeley
Title: A Permutation-based Model for Crowdsourcing: Optimal Estimation and Robustness
Abstract: The aggregation and denoising of crowd-labeled data is a task that has gained increased significance with the advent of crowdsourcing platforms and requirements of massive labeled datasets. In this paper, we propose a permutation-based model for crowd-labeled data that is a significant generalization of the popular "Dawid-Skene" model. Working in a high-dimensional non-asymptotic framework, we derive optimal rates of convergence for the permutation-based model. We show that the permutation-based model offers significant robustness in estimation due to its richness, while surprisingly incurring only a small statistical penalty as compared to the Dawid-Skene model. Finally, we propose a polynomial-time computable algorithm, called OBI-WAN, for provably efficient estimation under these models.
Speaker Bio: Nihar B. Shah is a final-year PhD candidate at UC Berkeley, working with Martin Wainwright and Kannan Ramchandran. His research interests include statistics, machine learning, and information theory, with applications to crowdsourcing. He is the recipient of the Microsoft Research PhD Fellowship (2014-16), the Berkeley Fellowship (2011-13), the IEEE Data Storage Best Paper and Best Student Paper awards for the years 2011/2012, and the SVC Aiya Medal from the Indian Institute of Science in 2010.