by R. Gomes, A. Krause

Abstract:

We consider the problem of extracting infor- mative exemplars from a data stream. Examples of this problem include exemplar- based clustering and nonparametric inference such as Gaussian process regression on mas- sive data sets. We show that these problems require maximization of a submodular function that captures the informativeness of a set of exemplars, over a data stream. We develop an efficient algorithm, Stream-Greedy, which is guaranteed to obtain a constant fraction of the value achieved by the optimal solution to this NP-hard optimization problem. We extensively evaluate our algorithm on large real-world data sets.

Reference:

Budgeted Nonparametric Learning from Data Streams R. Gomes, A. KrauseIn Proc. International Conference on Machine Learning (ICML), 2010

Bibtex Entry:

@inproceedings{gomes10budgeted, Author = {Ryan Gomes and Andreas Krause}, Booktitle = {Proc. International Conference on Machine Learning (ICML)}, Title = {Budgeted Nonparametric Learning from Data Streams}, Year = {2010}}