ML/Google Distinguished Lecture Series - Machine Learning Department - Carnegie Mellon University

Machine Learning Special Seminars

Fall 2016 Seminar
Date: December 12, 2016
Time: 2:30 PM - 3:30 PM
Location: 8102 Gates and Hillman Centers
Speaker: David Draper, Professor, Baskin School of Engineering, University of California-Santa Cruz
Title: Optimal Bayesian Analysis of A/B Tests at Big-Data Scale
Abstract: Several foundational approaches to the discipline of statistics, including independent theorems by R. T. Cox and de Finetti, have the following form: *IF* you uniquely specify (a) a prior and (b) a sampling distribution, *THEN* optimal inference and prediction are possible, and *IF* in addition you uniquely specify (c) an action space and (d) a utility function, *THEN* optimal decision-making under uncertainty is possible. This immediately raises a question: Are there problems in which the *context* uniquely specifies any of the four ingredients (a-d)? [Answer: yes; I'll refer to such settings as examples of *optimal Bayesian model specification*.] In this talk I'll show that optimal Bayesian analysis of data sets gathered with A/B testing at Big-Data scale is possible, via an application of Bayesian non-parametric modeling, and I'll also show how to do fast, accurate computations under this model even with tens or hundreds of millions of observations.