by ,
Abstract:
We consider the problem of extracting infor- mative exemplars from a data stream. Examples of this problem include exemplar- based clustering and nonparametric inference such as Gaussian process regression on mas- sive data sets. We show that these problems require maximization of a submodular function that captures the informativeness of a set of exemplars, over a data stream. We develop an efficient algorithm, Stream-Greedy, which is guaranteed to obtain a constant fraction of the value achieved by the optimal solution to this NP-hard optimization problem. We extensively evaluate our algorithm on large real-world data sets.
Reference:
Budgeted Nonparametric Learning from Data Streams R. Gomes, A. KrauseIn Proc. International Conference on Machine Learning (ICML), 2010
Bibtex Entry:
@inproceedings{gomes10budgeted,
	Author = {Ryan Gomes and Andreas Krause},
	Booktitle = {Proc. International Conference on Machine Learning (ICML)},
	Title = {Budgeted Nonparametric Learning from Data Streams},
	Year = {2010}}